Amazing!
As a follow-up to yesterday’s post, a howto for creating the LatinIME dictionary files (in this case raw-nl/main.dict)!
All info necessary to do so is readily available on http://code.google.com/p/softkeyboard/wiki/BinaryDictionaries
As a base, you need a wad of decent text, the more the better! The quality of the dictionary will improve as more data is fed into the file. The example on SoftKeyboard uses the Wikipedia for bulk text!
The Dutch dump is available on http://download.wikimedia.org/nlwiki/20100813/ and, considering the different dumps, I chose the one containing the articles, templates, image descriptions, and primary meta-pages, but not the revision or user data:
2010-08-13 16:17:13 done Articles, templates, image descriptions, and primary meta-pages.
2010-08-13 16:17:13: nlwiki 1003105 pages (382.643/sec), 1003105 revs (382.643/sec), 92.5% prefetched, ETA 2010-08-13 16:59:24 [max 1971556]
This contains current versions of article content, and is the archive most mirror sites will probably want.
pages-articles.xml.bz2 549.0 MB
The SoftKeyboard explanation offers the following bash script to analyse the wad of text into a weighted word list.
My blog screws up code, check gertschepens.be or the original page for the commands
Code from Softkeyboard, contributed by Jacob Nordfalk
This creates a weighted and sorted list of the words in the file. The more data, the more reliable this set will be. (The database dump of a Dutch forum would be a nice addition to get more of the common language in there, as a counterweight to the dictionary wording.)
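Since the bash script itself did not survive on this page, here is a rough Python sketch of the same idea: count how often each word occurs and sort the result, most frequent first. This is not the SoftKeyboard script. The function name, the regular expression and the choice to lowercase everything are my own assumptions, and it expects plain text rather than the raw Wikipedia XML.

```python
import re
from collections import Counter

# Hypothetical helper, not the SoftKeyboard script.
# A "word" here is a run of letters, optionally joined by ' or - (e.g. zo'n).
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def weighted_word_list(text, min_count=1):
    """Return (word, count) pairs, most frequent first."""
    # Lowercasing is a simplification: it merges "De" and "de",
    # but it also loses the capital of proper nouns.
    counts = Counter(m.group(0).lower() for m in WORD_RE.finditer(text))
    pairs = [(word, n) for word, n in counts.items() if n >= min_count]
    # Sort by descending count, then alphabetically to keep the order stable.
    pairs.sort(key=lambda pair: (-pair[1], pair[0]))
    return pairs
```

Raising min_count drops the rare words (typos, one-off names) that would otherwise bloat the dictionary.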
The weighted values are between 194 and 8671269 (the number of times a word is found in the text) for this specific export. The sample.xml, however, speaks of a frequency value from 0 to 255.
So I created a Perl script to fix the numbering; the script needs a sorted list, frequent words first, infrequent ones later. It cycles through the list and replaces the initial numbering (in the correctly formatted XML) with the weighted alternative.
The source is not available here for formatting reasons; click the Perl script link!
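For those who can't get to the Perl script, the renumbering can be sketched in Python. This is not my Perl script, just one possible way to squeeze raw counts into the 0 to 255 range. The logarithmic scale is an assumption (a linear one would push almost every word to 0, given the gap between 194 and 8671269), and the `<w f="…">` line layout is assumed from sample.xml, so check it against the real file.

```python
import math
from xml.sax.saxutils import escape


def scale_frequencies(pairs, lo=0, hi=255):
    """Map raw (word, count) pairs onto lo..hi with a logarithmic scale.

    The log scale is an assumption, not the method of the original script.
    Counts must be positive integers.
    """
    if not pairs:
        return []
    counts = [count for _, count in pairs]
    log_min, log_max = math.log(min(counts)), math.log(max(counts))
    span = (log_max - log_min) or 1.0  # all counts equal: avoid dividing by zero
    return [
        (word, lo + round((math.log(count) - log_min) / span * (hi - lo)))
        for word, count in pairs
    ]


def to_xml_lines(scaled):
    # Assumed layout, modelled on sample.xml: <w f="N">word</w>
    return ['<w f="%d">%s</w>' % (freq, escape(word)) for word, freq in scaled]
```

With the numbers of this export, the most frequent word ends up at 255 and a word seen 194 times at 0. Whether the keyboard treats a 0 as "never suggest" is something I have not verified; pass lo=1 if in doubt.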
Once we have a decent .xml, all we need to do is convert it to the necessary .dict format, which is described in the SoftKeyboard text; the necessary software is in their repositories.
After this, we have a valid main.dict file, ready to be compiled into a LatinIME pack. Compiling all languages into the pack will not be possible due to very limited space on the system partition (my Magic has about 1 MB free on its /system partition), so a solution will have to be found to store the dictionary files outside the LatinIME keyboard apk (which is preferable at any rate!). My Dutch dictionary was submitted to CM but will probably not make it in until the external/internal dictionary problem is solved! (In the meanwhile, I’m using a home-rolled version: the default CM keyboard with added raw-nl, currently available here.)
Where are the Keyboard Dictionaries in #Android?
Posted: 19th August 2010 by Gert in Android, Planet
I love the Froyo multiple languages keyboard feature! It’s AWESOME, sliding a finger over the keyboard to change language .. Awesome!
If your specific dictionary is in there, that is. Which it probably isn’t, since there are only 6 dictionaries in the default keyboard, and Dutch isn’t one of them. Dutch is in several other keyboards, like the HTC one, but sadly there is no shared space where these dictionaries live; on the contrary, they all reside in the .apk that contains the keyboard.
By opening /system/app/LatinIME.apk (as found in CyanogenMod), we find out that the dictionaries are in the .apk under the res directory. While we’re there, someone mentioned the availability of more dicts might be a size issue, but I don’t think so since they’re all quite moderate in size:
- raw-de: 739K
- raw-en: 822K
- raw-es: 768K
- raw-fr: 775K
- raw-it: 688K
- raw-sv: 911K
Also, the custom words are saved in a database at /data/data/com.android.inputmethod.latin/databases/auto_dict.db
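An .apk is just a zip file, so the dictionary listing above can be reproduced with a few lines of Python. A minimal sketch, assuming the dictionaries sit at res/raw-<lang>/main.dict as described; the function name is made up for the example, and it reports uncompressed sizes in bytes, which may differ from the figures listed above.

```python
import zipfile


def list_dictionaries(apk_path):
    """Return (entry name, uncompressed size in bytes) for every main.dict in the .apk."""
    with zipfile.ZipFile(apk_path) as apk:
        return sorted(
            (info.filename, info.file_size)
            for info in apk.infolist()
            if info.filename.startswith("res/raw")
            and info.filename.endswith("/main.dict")
        )
```

Call it on a copy of LatinIME.apk pulled from the phone, for instance list_dictionaries("LatinIME.apk").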
Now, looking to add a Dutch dictionary, I went looking in the AOSP. The LatinIME source is in ./packages/inputmethods/LatinIME and contains a ./packages/inputmethods/LatinIME/dictionaries directory. I expected to find the dictionary files there, but it only contains a sample.xml file. So no .xml dictionaries. The aforementioned res directory is at ./packages/inputmethods/LatinIME/java/res/ but contains none of the raw-lang directories.
The dictionaries do not appear to be part of the AOSP. I guess Google is not able to open source these?
While searching (the interwebs and IRC) I also discovered that a lot of other people (Issue 1827: add dictionaries for other locales (or make it easier for users to do so) – I’m d.gen… in that thread) were looking to add their language to the code tree and that some people had solved the problem by just rolling a custom LatinIME (Softkeyboard). I don’t like that option, however, since I’d rather strengthen the default tree than split from it and have to update the code after each AOSP update.
CyanogenMod, however, does have the dictionaries, and so I checked out the CyanogenMod tree. It took me a while to find out where they were, as I was expecting them to be in the paths I mentioned before, but no such luck. Apparently (and this makes sense) the CM-specific files are in the ./vendor/cyanogen directory, the binary dictionaries in
./vendor/cyanogen/overlay/common/packages/inputmethods/LatinIME/java/res/raw-sv/main.dict
./vendor/cyanogen/overlay/common/packages/inputmethods/LatinIME/java/res/raw-de/main.dict
./vendor/cyanogen/overlay/common/packages/inputmethods/LatinIME/java/res/raw-fr/main.dict
./vendor/cyanogen/overlay/common/packages/inputmethods/LatinIME/java/res/raw-en/main.dict
./vendor/cyanogen/overlay/common/packages/inputmethods/LatinIME/java/res/raw-it/main.dict
./vendor/cyanogen/overlay/common/packages/inputmethods/LatinIME/java/res/raw-es/main.dict
So. How to add new dictionaries to Android?
- Either in AOSP under packages/inputmethods/LatinIME/dictionaries/ as .xml files (preferable),
- under packages/inputmethods/LatinIME/java/res/ as binary files
- or if that turns out to be impossible for some reason, adding them to CM in the respective vendor/cyanogen/overlay/common/packages/inputmethods/LatinIME/ directories (so that at least CM has the extended languages).
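The third option in the list above boils down to dropping a main.dict into a new raw-<lang> directory of the CM overlay. A small sketch of that step, assuming a checked-out CyanogenMod tree with the overlay path found earlier; install_dictionary is a made-up helper for this example, not part of any build system.

```python
import shutil
from pathlib import Path

# Overlay path as found in the CyanogenMod tree, relative to the tree root.
OVERLAY_RES = Path(
    "vendor/cyanogen/overlay/common/packages/inputmethods/LatinIME/java/res"
)


def install_dictionary(tree_root, lang, dict_file):
    """Copy dict_file to <tree_root>/<overlay res>/raw-<lang>/main.dict.

    Returns the path of the copied file.
    """
    target_dir = Path(tree_root) / OVERLAY_RES / ("raw-" + lang)
    target_dir.mkdir(parents=True, exist_ok=True)  # create raw-<lang> if missing
    target = target_dir / "main.dict"
    shutil.copyfile(dict_file, target)
    return target
```

For the Dutch dictionary that would be install_dictionary("path/to/cm-tree", "nl", "main.dict"), after which the keyboard still has to be rebuilt.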
The ideal situation, however, would be to split the dictionaries from the keyboards and put them somewhere where any application might use them, making it possible to install new dictionary packs (e.g. DutchDictionary.apk) from the market, thus solving the whole dictionary problem. Maybe by adding them in /data/data/com.android.inputmethod.latin/databases/, although there is probably a performance reason for them not being there in the first place?
Anyhow, in the meanwhile, we do need the dictionaries! So let’s not wait for this and add the data to AOSP now.
I’m wondering though why the Softkeyboard people don’t add their dictionaries to AOSP.. I do see the benefit of keeping it available in the market as a separate app, i.e. making it available to every Android user instead of those “few” tinkerers running the latest AOSP (or mods based on it).
I don’t get it. Telenet has this Wonderful detail view. I’m not on the unlimited plan and I can see my downloading habits in this great graph: how much (yeah, that’s 83G this month), what type of volume, .. It’s all there.
Then Why the Fuck do I have friends, paying more than I do, NOT getting that info?
The answer seems very simple.. because they don’t want us to know exactly what the “Fair Use” numbers are. It keeps a foot in the door to throttle anyone, under the flag of data usage reasons (Your arrow points red, You only get 10Gb for the rest of the month), whenever they like. Or can anyone point me to an official reason?
And oh yeah, you’re paying more to have them treat you like that.
The internet is a creative and often cruel place!
Posted: 17th August 2010 by Gert in Interwebs, Planet
And documenting the most popular and often most cruel pieces of creativity is http://knowyourmeme.com/
And there are some AMAZING things on here. Do read the dedicated pages however; the base video might be worth watching, but the epic part is what happens with it afterwards. People’s reactions, follow-up videos, remixes or parody..
From the very funny
Lying Down Game is a mass-participatory meme that involves having one’s photograph taken whilst lying rigidly face down in public space and then sharing the image via internet. Since becoming popular through Facebook around June 2009, the viral game has spawned thousands of photos of people lying flat in public landscapes across the world. The official Facebook Group hosts over 19,000 pictures.
to the Extremely Cruel (but, forgive me, also very funny)
Jessi “POP A GLOCK AND MAKE A BRAIN SLUSHY” Slaughter
What happened next?
Check knowyourmeme.com: Jessi Slaughter
It’s brutal. It’s what happens when an 11yo goes wild and unsupervised on YouTube. You done goofed!
and
Antoine Dodson
What happened next?
Check knowyourmeme.com: Antoine Dodson
All in all the guy is some sort of a hero. He saved his sister and I’m very glad he’s taking the whole thing very well. Also the rap Rules
Kudos to Kidstatic and the Gregory Brothers! Hide ya kids, hide ya wife and hide ya husband too! Hilarious!!!
And our own Kimberley Vlaeminck is on there too!
Gotta love Youtube. Gotta love 4chan
In the end, “know your meme” is a cautionary tale to people around the world. And I guess, an extra reminder to parents too.. Watch the fuck out with the internets, coz once it’s on there, it’s on there! And if it’s a big cockup, EVERYONE’ll know about it, everyone will have an opinion about it. And you can be sure there’ll be some crazy ass derivative stuffs made on top of it!
It’s great to see the world’s population create with whatever mad technologies are thrown out there! It’s like a million monkeys with a typewriter and one of them accidentally writing Shakespeare. But with the internet bringing it to our homes and the rest of the monkeys flocking towards the awesome, weeding out the crap!
Had another unit lying around, gathering dust. Mini portable servers are an incredible source of potential fun.. if you can do with them what you need to. And that means not being locked down to the firmware. Initial firmware was 0.7.2 r3, which sucks. Luckily, after a hardware reset, the software reverted to the more handle-able 0.7.1 r1.
After some research, it turns out this Fon version has a locked down RedBoot (the software that makes the trick in the previous post work). Unlocking this means getting ssh access (today through a sad code injection exploit – are they Serious?? Luckily for us exploiting fuckers, they are). Once connected over ssh we need to upgrade the kernel to a version that will allow us to alter the RedBoot config. And a reboot later we need to actually update the config. After that it’s back to the old automated method as used in the earlier post.
Life can be simple. The whole thing in links:
- Getting SSH; an Injection Exploit story: http://blog.blase16.de/2006/11/28/Hacking-Fonera
- Enabling telnet into RedBoot: http://wiki.openwrt.org/toh/fon/fonera#enabling.telnet.into.redboot.without.serial.access
- The previous post with the links to the necessary stuff to automagically update through RedBoot: http://blog.cone.be/2010/08/11/openwrt-ing-my-fonera/
There, Another Kamikaze up n running.
And tomorrow it’s time to have some Wireless fun!
Wait, Stop, Hold the flaming, I know. It’s slightly unethical to do so. But the Fon2200 is utterly unusable in its usual, locked down version. I don’t need the NAT (au contraire, it’s extremely annoying in my setup); I need more options to separate the fon and the private networks; .. Not to mention that I was tired of the many connection problems. So it was throwing it out for scrap – or flashing the little bitch!
So Open-WRT it is! Sadly I soon learnt that Fon has due protections against exactly that. A hardware reset and login later I discovered I was rocking fon2200 0.7.1 r5 firmware, and a google search later I learnt that everything I had read so far (including the fonera hacking history lesson) wouldn’t work.
Luckily, the interwebs helped out.
- Always good to know what you’re fucking with: OpenWRT Fon2200 details
- DD-WRT easy Fonera Flash method: FON NEW KERNEL >= 2.6.21.1 SPECIAL FLASH METHOD
- The easy flash software for Linux: http://fon.testbox.dk/flashing/GUIflasher/linux/
- (the flashing guide is on there too http://fon.testbox.dk/flashing/flashing_guide.txt )
- The info to get it running on 64-bit linux [fonera] Flasher une fonera sous Ubuntu 64bits avec ap51-flash (yeah, that’s French – The upside? It’s not rocket science, merely French)
- The necessary files to flash with are on the aforementioned OpenWRT site (you could also consider DD-WRT)
All this and a slight panic later
I fear Ubuntu network manager just screwed up my openWRT update by resetting eth0 mid procedure.. ah, it just continued. #CloseOne
Gen
I was rocking my sleek new OpenWRT install
(yes I do realize that last link doesn’t go anywhere sensible for you.)
Next up – testing how well it works as a wifi extender style thingy (i.e. serving a Wifi network, maybe even a second, and having it pass through to a different Wifi network) to solve the connection problems in some parts of the house (another thing the original firmware wouldn’t do), and afterwards, setting it up as an advanced wifi access point for the in-house network. Hooray!
The SAGE ends today (I received my present earlier and it’s Awesome, tnx squidman!) and as such I present to you my personal SAGE gift. And to borrow Zak’s words.. “Happy birthday Ghost of Gary, we still like your game…and thanks for coming to the party, Ghost of Dave, we couldn’t have done it without you.”
The request
> I want a monster that’s Lovecraftian- not Cthulhu-thing Lovecraftian,
> but rather Nyarlathotep-style subversion-of-men Lovecraftian. There
> should be no relation of any sort to tentacles. Pictures not
> necessary, but a write-up and basic stats/powers should be enough.
> Stats preferably 3.5 rules.
I first started off with a strong Nyarlathotep theme, but decided (after it was 50% completed) that I needed more of a monster and started over. The overall Cthulhu theme as I see it (the gods etc) is mostly about a huge and incomprehensible evil lurking over humanity. Not much is known, but most of that is very fearsome. Nyarlathotep kicks it up a notch (BAM!) as it disguises the evil in everyday forms and even takes away the recognisable “pure evil” actions and puts more mundane ones in place that are equally devastating in the long run.
I had a long think about a disguised evil of enormous proportions. Ever looming and impossible to destroy. So Hathédru was born! (I hope he’s what you were looking for!)
Hathédru
Hathédru is an ancient god, slumbering deep underground and only sporadically manifesting itself in the cities of humanoids, though never without disastrous results.
It is supported by a widespread and well organised secret cult whose members believe they receive the will of Hathédru through their elder Sages.
Or at least that’s what some believe. A small faction supports the less popular idea that describes the Hathédru as a sentient hive of bugs, the hive mind often combining thousands of bugs (and a host of magical racial traits) to mimic humanoid forms.
After the disappointments and sparkle of hope I talked about in “Updating the Hero” earlier, I kept a close watch on the site with that crontab I described. (damn, what machine is that running on again? .. probably the Soekris…) The crontab yielded results several times, but it was never the update I was waiting for. Tired of waiting, I mailed the HTC guys again and got the following reply today!
http://www.htc.com/europe/SupportDownload.aspx?p_id=283&cat=2&dl_id=996
Approximate translation of the Dutch reply: Please try the link above with your serial number, this should work.
So I did and the download worked! I now have the necessary files and am preparing the Hero for updating. I don’t like that it requires Windows, but getting Virtualbox up and running solved that too.
After some backing up (the upgrade apparently destroys all data and you can’t be too sure, so I backed up the SD too) we’re ready to Go! (More breaking news as the matter evolves!)
I guess the whole HTC Sync to Virtualbox thing doesn’t work as easily as I hoped – Sync and the phone don’t find each other as they should.. I guess I’ll have to find a Windows computer.
But at least I now have what should be a working upgrade file for my wife’s Belgian HTC Hero :s
As a first in the “Conversations” series (yes series, because I am SO interesting that I write interesting shit Constantly, even in emails to random people who I sometimes don’t even know or have reason to mail!), We present to you: The iPhone Question. (yes, that was a Royal We!)
Today, part of a conversation with a friend considering the HTC Wildfire and asking the iPhone question. A long answer ensued (even though by then he had already decided to buy the Wildfire, ruling out the iPhone quickly after seeing the outrageous price..) No better time to spew my own outrageous opinions!
And why exactly is an iPhone so much worse than an Android? Concretely, I mean…
Pff, there are all sorts of problems with the iPhone. Everyone thinks it’s a fantastic phone, but in essence that’s only because there is an apple on it and because Stevie says it’s THE phone.
The iPhone 4 already fixes a few things.. Like multitasking.. small stuff like the fact that you can now finally put icons in a folder on your desktop. You can now also shoot videos with your phone; if I’m not mistaken you can even do tethering (using your mobile data plan to surf over wifi with your laptop or whatever), and if I have it right it has recently even become possible to change your phone’s wallpaper!!! …
What you can Not do, by the way .. You can’t replace the battery, can’t insert or swap SD cards, no widgets (clock, calendar, weather forecast on the desktop), and then we haven’t even mentioned their recent “Revolutionary Innovative Antenna” blunder and (because in essence that blunder isn’t that spectacular) their disgusting and arrogant reaction to it.. The iPhone also still uses those annoying proprietary cables where HTC uses standard mini-USB cables. And then there’s that annoying iTunes; no PC needed for Android!
In essence there are heaps of small things that simply put those phones behind current tech..
It was a good phone in its day, but there are better phones by now (Android phones), and if you take the price into account you have a Highly Overpriced piece of Shit – sold as gold by good marketing and an army of users with Stockholm Syndrome.
Then again, I haven’t walked around with an Apple phone for a couple of weeks yet. … it is frustrating to hear everyone preach about the iPhone as if it were God’s left testicle. So, a question without a neutral or truly short answer..
You know, part of the above is also summarized here:
iPhone4 vs HTC Evo http://www.youtube.com/watch?v=FL7yD-0pqZg
Even more Then Again – [iPhone user name omitted to protect the victims] is thoroughly enjoying his iPhone Stockholm syndrome .. so it can’t be physically unpleasant. On the other hand, after the procedure a frontal lobotomy apparently isn’t so bad either..
To highlight a few high points ..
- “Everyone thinks [the iPhone is] a great phone but in truth that’s only because there is an apple on the device and because Stevie says it’s THE phone.”
- “[The iPhone is] a Highly Overpriced piece of Shit – sold for gold by good marketing and an army of users suffering Stockholm Syndrome.”
- “It’s frustrating to hear everyone preach about the iPhone as if it were God’s left testicle”
- “Then again, [iPhone Users] fully enjoy their iPhone Stockholm syndrome .. so it can’t be physically unpleasant. On the other hand, once the procedure is complete, a frontal lobotomy isn’t unpleasant either.”
And I didn’t even mention the porn.





