Friday, August 29, 2008

(Smart)Find for Firefox 3.*

Inspired by our own needs and those of others, a few friends (André, Roberto and Tomaz) and I decided to improve Firefox's default find feature.

Story

If you use Firefox, you have probably used its find bar already. It is the simple and intuitive way users search for words in a web page. The main thing we "miss" is that it requires the user to type the exact word (or set of words) being searched for in order to find it. Sometimes, however, lazy users *like us* just want to give the find bar a "clue" of what we want to find in the page, and often we do not even know the exact spelling. The result of that was the SmartFind add-on for Firefox 3.x. Its site lists some use cases for SmartFind:

  1. "Schwarzenegger". This looks easy when you are looking at the word, but what if you have no idea how to spell it? Then you think: "I would like to type something like 'chuazenger' and have it find similar words in the page".
  2. As another example, when you are on a page that uses accented characters and your keyboard is not configured to type those accents (~^`), the traditional find method will not match any word.
  3. You are just lazy like us and terrible at spelling :)

How

SmartFind ranks how similar each word in the page is to the word typed by the user (yes, that might take time depending on the web page; heuristics exist to speed it up, and new ones are welcome). For this, we are implementing the Levenshtein distance metric. The "how similar" threshold can be adjusted by the user through the Edit->SmartFind menu, to be more or less restrictive.
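
To make the ranking idea concrete, here is a minimal Python sketch of the Levenshtein distance and of ranking page words against a query. It is not SmartFind's actual code: the function names and the `max_distance` threshold (standing in for the restrictiveness setting in the menu) are assumptions made for illustration.

```python
def levenshtein(a, b):
    """Edit distance counting single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute ca -> cb
        previous = current
    return previous[-1]


def rank_words(page_text, query, max_distance):
    """Return (distance, word) pairs within max_distance, most similar first.

    max_distance is a hypothetical knob: smaller means more restrictive.
    """
    words = set(page_text.split())
    scored = [(levenshtein(w.lower(), query.lower()), w) for w in words]
    return sorted(pair for pair in scored if pair[0] <= max_distance)
```

With this sketch, `rank_words("Arnold Schwarzenegger was governor", "chuazenger", 6)` keeps only "Schwarzenegger", which is 5 edits away from the query.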

In practical terms, whenever Firefox's original find method does not find anything, SmartFind takes over, and hitting an additional "enter" fires its action ...

If it is still not clear, see the screenshots below:




User is not from Austria and does not know how to spell this-big-name-above.

SmartFind finds it for him.

Disclaimer: This is just the first, quickly implemented public release, so there are some known problems and pending items:

  • Improve the way it gets text content from the web page (a better XPath expression that ignores the text content of "object", "style" and "script" tags).
  • Fix problems with line breaks (lack of "br" tag).
  • Implement a user-friendly way to walk through the list of most similar items found: currently we are stuck at the top of the list.
  • Implement a fuzzy third state ("similar"?) when comparing characters, which could make SmartFind work much better: for example, consider 'ë' not as completely different from 'e' or 'é', but as similar. With that done, the Finnish word "päivää" would be more similar to "paivaa", and so on ...
  • Port it to the Fennec (Firefox Mobile) browser.
  • Make it work with other Firefox add-ons, including FindInTabs.
  • Add Soundex support (?).
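
The "fuzzy third state" item in the list above could be sketched roughly as follows, in Python. This is only one possible approach, not something SmartFind implements: characters that share the same base letter once accents are stripped get a reduced substitution cost. The helper names and the 0.5 cost are assumptions chosen for illustration.

```python
import unicodedata


def base_letter(ch):
    """Strip combining accents from one character, e.g. 'ä' -> 'a'."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def substitution_cost(a, b, similar_cost=0.5):
    """0 for equal characters, similar_cost for accent variants, 1 otherwise."""
    if a == b:
        return 0.0
    if base_letter(a) == base_letter(b):
        return similar_cost  # hypothetical "similar" third state
    return 1.0


def fuzzy_distance(s, t, similar_cost=0.5):
    """Levenshtein-style distance with a cheaper substitution for accent variants."""
    # Compose first so that an accented letter is a single character.
    s = unicodedata.normalize("NFC", s)
    t = unicodedata.normalize("NFC", t)
    previous = [float(j) for j in range(len(t) + 1)]
    for i, cs in enumerate(s, start=1):
        current = [float(i)]
        for j, ct in enumerate(t, start=1):
            current.append(min(previous[j] + 1.0,
                               current[j - 1] + 1.0,
                               previous[j - 1] + substitution_cost(cs, ct, similar_cost)))
        previous = current
    return previous[-1]
```

Under these assumptions, "päivää" and "paivaa" are 1.5 apart instead of 3, so the accented word ranks noticeably closer to the unaccented query.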

--Antonio Gomes
tonikitoo at gmail dot com



Thursday, August 21, 2008

Open source browser wars: Mozilla and WebKit raw build numbers on Maemo (ARM)

The big picture

Bosses are only pleased when they get reports with numbers and charts on their desks. This time, we (André, Diego, Fernanda and I) had to benchmark Mozilla and GTK WebKit on ARM, and so we did ... The text below is an informal summary of the official report.

Scenario
  • Experiments were performed using Nokia N810 Internet tablets (400 MHz CPU and 128 MB of RAM) with Chinook installed.
  • Applications under test were the GTK+ embedding sample browsers from both Mozilla (TestGtkEmbed, available from the mozilla-central repository) and WebKit (GtkLauncher, from the WebKit trunk), cross-compiled in Scratchbox 1.0.8 with gcc-2005q3-2.
About the tests

These were the items measured:

1. Page load speed

The page load test measures the absolute and relative time needed by each browser to open a given set of locally stored web pages. The test is driven by a JavaScript script that fires the load of each page in the test set. Both the time each web page takes to load (relative time) and the total time taken to load the whole set (absolute time) are measured. The test set consists of 37 real-world web pages fetched using the httrack crawler and is served over an ad-hoc Wi-Fi network by the Apache web server.

2. Memory and CPU consumption during 1)

The goal of this test is to evaluate how well the browsers manage system memory during the page load test (above): the virtual and physical memory allocated by each browser were monitored and plotted in a comparison chart. To do so, a Bash script was developed to "watch" system memory and CPU numbers. It is basically a timer that polls, once per second, the browser's virtual and physical memory values from "/proc" and its CPU usage from "top" (in batch mode).
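
As a rough illustration of the polling idea (the original tool was a Bash script, which is not reproduced here), the Python sketch below samples a process's virtual and physical memory from "/proc/<pid>/status" once per second. The function names are hypothetical, the VmSize/VmRSS fields are the standard Linux ones, and CPU sampling via "top" is left out.

```python
import time


def parse_status(status_text):
    """Extract VmSize (virtual) and VmRSS (physical) in kB from /proc/<pid>/status text."""
    values = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            values[key] = int(rest.split()[0])  # first token is the number, second is "kB"
    return values


def watch(pid, samples, interval=1.0):
    """Poll the process 'samples' times, sleeping 'interval' seconds between reads."""
    readings = []
    for _ in range(samples):
        with open("/proc/%d/status" % pid) as status_file:
            readings.append(parse_status(status_file.read()))
        time.sleep(interval)
    return readings
```

Calling `watch(browser_pid, 60)` would collect one minute of readings, which can then be written out and charted.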

3. Javascript engine performance.

Both browsers were run against the Dromaeo and SunSpider JavaScript test suites, and memory consumption during these tests was measured (the same way as in 2).

4. CSS compliance

Both browsers were run against the Acid3 CSS test suite.

Results and Charts

As there is no general browser benchmark suite (is that even possible?), we developed our own tools (Bash and JS) for the items above that have no well-known public benchmark available (page load and resource consumption).

ps: I tried Mozilla's Talos suite, which seems fine for testing Mozilla, but it is not portable to other browsers.

1. Page load speed

Mozilla's and Webkit's page load speed against the given pageset (see table below).

Individual pageload speeds (in ms).

DISCLAIMER
  • The original page set consisted of 85 web pages from the Talos Mozilla test suite. Although TestGtkEmbed ran well through the entire original test set, GtkLauncher always got OOM-killed after running the page load test for a while, due to lack of memory (see the memory chart in the Memory Consumption section). So, from the original 85 web pages, 37 were chosen so that GtkLauncher could finish this test.
  • 10 of the 37 remaining web pages in the page set contain non-UTF-8 characters (Russian, Japanese, Chinese, ...). While WebKit misrendered most of these fonts, Mozilla rendered all of them fine. An example is shown below:
www.3721.com in TestGtkEmbed

www.3721.com in GtkLauncher

2. Memory and CPU consumption during 1)

Memory consumption while doing page load test in 1).

UPDATE: VIRTUAL AND PHYSICAL MEMORY LABELS ARE CHANGED HERE.

CPU load while doing page load test in 1).

3. Javascript engine performance.

Dromaeo numbers.
SunSpider numbers.

Browsers' memory use while running Dromaeo.

Browsers' memory use while running SunSpider.

4. CSS compliance

Mozilla on Acid3.

Webkit on Acid3.

Conclusion

Some highlights from the numbers:
  • Mozilla managed memory better during the page load tests, although WebKit was faster. The results might have been affected by the fact that Mozilla rendered all non-Western characters well, while WebKit failed to.
  • WebKit was faster and used less memory during both the Dromaeo and SunSpider tests.
UPDATE: Some things that have to be pointed out about my Mozilla build are:
  • I do not have jemalloc enabled, but would love to.
  • The Mozilla guys are doing an amazing job speeding up their JavaScript engine: TraceMonkey will probably make things much (5x at least?) faster.
ps: I personally would not mind doing a 2nd round of tests and charts with these two items above enabled in my Mozilla build.


--Antonio Gomes
tonikitoo at gmail dot com

Wednesday, August 13, 2008

Prism for maemo updates

The Mozilla Prism guys, mfinkle and plasticmillion, are doing a great job heading to 1.0, and happily we are about to start getting Prism changes for maemo upstream.

Below are some nice screenshots of the current Prism maemo port, prism-maemo_0.0.7-1 (which is just out), running on Chinook.




Prism on meebo.com


Prism on mibbit.com


--Antonio Gomes
tonikitoo at gmail dot com

Tuesday, August 05, 2008

Mozilla/Firefox Summit 2008

So, the much anticipated Mozilla Summit 08 has just finished, and despite some unforeseen events, I truly believe that most of the 400 (+-) attendees had an amazing time, as did I. Here are some of my highlights:
Some things left unaccomplished after the event:
  • I wish I had met Daniel Glazman and John Resig.
  • I wish I had had a shorter trip back: 7 hours (from Whistler to Vancouver) + 5 hours waiting + 5 hours from Vancouver to Toronto + 1 hour waiting + 12 hours from Toronto to São Paulo + 12 hours waiting + 4 hours from São Paulo to Manaus = 46!
Long live Mozilla ...

--Antonio Gomes
tonikitoo at gmail dot com