@p01
Forked from Prinzhorn/LICENSE.txt
Created October 3, 2011 21:55
American Soundex in 146 bytes

// annotated code will come soon
function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=+'1230120022455012623010202'[j-98]||'';return(r.replace(/(\d)\1+/g,'$1')+'000').slice(0,4)}
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
Copyright (C) 2011 YOUR_NAME_HERE <YOUR_URL_HERE>
Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. You just DO WHAT THE FUCK YOU WANT TO.
{
"name": "americanSoundex",
"description": "An implementation of the American Soundex algorithm.",
"keywords": [
"soundex",
"american"
]
}
<!DOCTYPE html>
<title>American Soundex</title>
<div>Expected value: <b>R163,A500,B556</b></div>
<div>Actual value: <b id="ret"></b></div>
<script>
// write a small example that shows off the API for your example
// and tests it in one fell swoop.
var myFunction = function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=+'1230120022455012623010202'[j-98]||'';return(r.replace(/(\d)\1+/g,'$1')+'000').slice(0,4)}
document.getElementById( "ret" ).innerHTML = [myFunction('Robert'), myFunction('Anna'), myFunction('Bananarama')];
</script>
@pvdz

pvdz commented Oct 4, 2011

Without having looked up Soundex: if I read this properly, you're checking whether s[i] is a non-zero digit for string s. If so, couldn't you replace that with s[i]|0? Any non-digit becomes 0, and any digit becomes that digit as a number. Saves you a shitload of .charCodeAt bytes :)

On a side note, if you're dead set on passing data in through the function call (as per previous discussion, like in base64), you could require the caller to pass 0 for i, saving you another two bytes. Maybe even require r to be the first character of s, although I think that's a bit far-fetched :)
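
For reference, a quick sketch of what `|0` does to single characters, next to the table lookup from the gist. The variable names are made up for illustration, and it assumes the input is a plain name made of letters.

```javascript
// Bitwise OR with 0 coerces its operand to a 32-bit integer.
const digitChar = '5' | 0;  // a digit character becomes that digit
const letterChar = 'n' | 0; // Number('n') is NaN, and NaN | 0 is 0

// Lookup from the gist: 'n' has char code 110, so the index is 110 - 98 = 12.
const viaLookup = +'1230120022455012623010202'['n'.charCodeAt(0) - 98];

console.log(digitChar, letterChar, viaLookup);
```

Every letter coerces to 0 under `|0`, so on letter-only input the table still seems to be needed to get from a letter to its Soundex digit.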

@p01
Author

p01 commented Oct 4, 2011

I made this fork because my approach was radically different from the original gist. It turned out to be more efficient, and golfing this puppy down to 136 bytes was a breeze.

See the original gist (a fork of the master 140bytes gist) for more details.

In American Soundex, zeroes are only used for padding. Here I am using a lookup string that maps each letter from b to z to its corresponding digit. The characters to be discarded are mapped to 0 only because that makes them easy to nuke: +'0'||'' evaluates to an empty string.
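
The lookup described above can be spelled out in a longer form. This is only a sketch of the same steps, not the annotated version promised in the gist; `CODES` and `soundexSketch` are hypothetical names, and it walks the string by length rather than stopping on a falsy char code.

```javascript
// Hypothetical ungolfed rewrite; the names are illustrative, not from the gist.
// Digits for the letters 'b'..'z' (char codes 98..122); '0' marks letters to discard.
const CODES = '1230120022455012623010202';

function soundexSketch(name) {
  let result = name[0]; // the first character is kept as-is (no case folding)
  for (let i = 1; i < name.length; i++) {
    const digit = CODES[name.charCodeAt(i) - 98];
    // +'0' is 0 and +undefined is NaN; both are falsy, so `|| ''` drops them.
    result += +digit || '';
  }
  // Collapse runs of the same digit, pad with zeroes, keep four characters.
  return (result.replace(/(\d)\1+/g, '$1') + '000').slice(0, 4);
}

console.log(soundexSketch('Robert')); // R163
console.log(soundexSketch('Anna'));   // A500
```

One consequence of dropping the zero-coded letters before the regex runs: digits that were separated by a vowel end up adjacent and collapse as well. For 'Bananarama' the sketch builds `B5565` and returns `B565`, which differs from the `B556` listed as the expected value in the test page, so that case may be worth re-checking.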
