Damn Cool Algorithms: Levenshtein Automata Posted by Nick Johnson | Filed under python, coding, tech, damn-cool-algorithms In a previous Damn Cool Algorithms post, I talked about BK-trees, a clever indexing structure that makes it possible to search for fuzzy matches on a text string based on Levenshtein distance - or any other metric that obeys the triangle inequality. Today, I'm going to describe an alternative approach, which makes it possible to do fuzzy text search in a regular index: Levenshtein automata. Introduction The basic insight behind Levenshtein automata is that it's possible to construct a finite state automaton that recognizes exactly the set of strings within a given Levenshtein distance of a target word. Of course, if that were the only benefit of Levenshtein automata, this would be a short article. Construction and evaluation The diagram on the right shows the NFA for a Levenshtein automaton for the word 'food', with maximum edit distance 2. Because this is an NFA, there can be multiple active states. Indexing
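To make the "multiple active states" point concrete, here is a minimal sketch of simulating such an NFA directly in Python. It is not the code from the original post: the function name `levenshtein_nfa_accepts` and the encoding of each state as a pair (characters of the term consumed, edits used) are assumptions made for illustration.

```python
def levenshtein_nfa_accepts(term, k, word):
    """Return True if `word` is within Levenshtein distance `k` of `term`.

    Hypothetical helper: each NFA state is a pair (i, e), meaning i characters
    of `term` have been matched so far using e edits.
    """
    def expand(states):
        # Deletions consume no input: (i, e) -> (i + 1, e + 1) skips term[i].
        stack = list(states)
        closed = set(states)
        while stack:
            i, e = stack.pop()
            if i < len(term) and e < k:
                nxt = (i + 1, e + 1)
                if nxt not in closed:
                    closed.add(nxt)
                    stack.append(nxt)
        return closed

    def step(states, c):
        out = set()
        for i, e in states:
            if i < len(term) and term[i] == c:
                out.add((i + 1, e))          # exact match, no edit spent
            if e < k:
                out.add((i, e + 1))          # c was inserted into the word
                if i < len(term):
                    out.add((i + 1, e + 1))  # c substitutes for term[i]
        return expand(out)

    active = expand({(0, 0)})
    for c in word:
        active = step(active, c)
        if not active:
            return False  # no state survives, so the word cannot match
    # Accept if any active state has consumed the whole term.
    return any(i == len(term) for i, _ in active)
```

For the example in the text, `levenshtein_nfa_accepts("food", 2, "fxoood")` returns True (two insertions), while `levenshtein_nfa_accepts("food", 2, "goat")` returns False (three substitutions are needed).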
Zutopedia Rope (data structure) A simple rope built on the string of "Hello_my_name_is_Simon". In computer programming, a rope, or cord, is a data structure for efficiently storing and manipulating a very long string. For example, a text editing program may use a rope to represent the text being edited, so that operations such as insertion, deletion, and random access can be done efficiently. [1] Description A rope is a binary tree. The binary tree can be seen as several levels of nodes. Operations Index Definition: Index(i): return the character at position i. Time complexity: O(log N), where N is the length of the rope. To retrieve the i-th character, we begin a recursive search from the root node:
// Note: Assumes 1-based indexing.
function index(RopeNode node, integer i)
    if node.weight < i then
        return index(node.right, i - node.weight)
    else
        if exists(node.left) then
            return index(node.left, i)
        else
            return node.string[i]
        endif
    endif
end
Split Time complexity: O(log N)
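The index operation can also be sketched as runnable Python. This is a minimal illustration, not a full rope: the names `RopeNode`, `leaf`, `concat` and `index` are hypothetical, and unlike the pseudocode above it uses 0-based indexing to match Python strings. Each leaf's weight is its string length, and each internal node's weight is the total length of its left subtree.

```python
class RopeNode:
    """Hypothetical rope node: a leaf holds a string, an internal node holds two children."""

    def __init__(self, left=None, right=None, string=None):
        self.left = left
        self.right = right
        self.string = string
        # Leaf: weight is the string length. Internal: total length of the left subtree.
        self.weight = len(string) if string is not None else length(left)


def length(node):
    """Total number of characters stored under `node`."""
    if node is None:
        return 0
    if node.string is not None:
        return len(node.string)
    return node.weight + length(node.right)


def leaf(s):
    return RopeNode(string=s)


def concat(a, b):
    return RopeNode(left=a, right=b)


def index(node, i):
    """Return the character at 0-based position i."""
    if node.string is not None:
        return node.string[i]
    if i < node.weight:
        return index(node.left, i)                # position lies in the left subtree
    return index(node.right, i - node.weight)     # skip over the left subtree


# The example string from the text, split into arbitrary leaves.
rope = concat(concat(leaf("Hello_"), leaf("my_")),
              concat(leaf("na"), concat(leaf("me_i"), leaf("s_Simon"))))
```

With this rope, `index(rope, 0)` returns `'H'` and `index(rope, 6)` returns `'m'`. Each step descends one level, so the cost is proportional to the depth of the tree, which is O(log N) when the rope is balanced.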
Damn Cool Algorithms, Part 1: BK-Trees - Nick's Blog Posted by Nick Johnson | Filed under coding, tech, damn-cool-algorithms This is the first post in (hopefully) a series of posts on Damn Cool Algorithms - essentially, any algorithm I think is really Damn Cool, particularly if it's simple but non-obvious. BK-Trees, or Burkhard-Keller Trees are a tree-based data structure engineered for quickly finding near-matches to a string, for example, as used by a spelling checker, or when doing a 'fuzzy' search for a term. BK-Trees were first proposed by Burkhard and Keller in 1973, in their paper "Some approaches to best match file searching". Before we can define BK-Trees, we need to define a couple of preliminaries. Now we can make a particularly useful observation about the Levenshtein Distance: It forms a Metric Space. These three criteria, basic as they are, are all that's required for something such as the Levenshtein Distance to qualify as a Metric Space. The tree is N-ary and irregular (but generally well-balanced).
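As a rough illustration of how the metric-space property gets used, here is a small BK-tree sketch in Python. It is not the code from the original post: the names `levenshtein` and `BKTree` and the representation of a node as a (word, children) tuple are assumptions. Each child hangs off its parent on an edge labelled with its distance to the parent, and the triangle inequality lets a search skip any edge outside the range [d - n, d + n].

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


class BKTree:
    """Hypothetical minimal BK-tree; each node is (word, {edge_distance: child})."""

    def __init__(self, distance=levenshtein):
        self.distance = distance
        self.root = None

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = self.distance(word, node[0])
            if d == 0:
                return  # word is already in the tree
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, query, n):
        """Return sorted (distance, word) pairs with distance <= n."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            word, children = stack.pop()
            d = self.distance(query, word)
            if d <= n:
                results.append((d, word))
            # Triangle inequality: matches can only lie on edges in [d - n, d + n].
            for edge, child in children.items():
                if d - n <= edge <= d + n:
                    stack.append(child)
        return sorted(results)
```

A typical use is to call `add` for every dictionary word once, then call `search(query, n)` for each lookup; only the subtrees that can possibly contain a match are visited.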
Sorting Algorithm Animations Algorithms in Java, Parts 1-4, 3rd edition by Robert Sedgewick. Addison Wesley, 2003. Quicksort is Optimal by Robert Sedgewick and Jon Bentley, Knuthfest, Stanford University, January, 2002. Dual Pivot Quicksort: Code by Discussion. Bubble-sort with Hungarian (“Csángó”) folk dance YouTube video, created at Sapientia University, Tirgu Mures (Marosvásárhely), Romania. Select-sort with Gypsy folk dance YouTube video, created at Sapientia University, Tirgu Mures (Marosvásárhely), Romania. Sorting Out Sorting, Ronald M.
President Obama’s Dragnet Those reassurances have never been persuasive - whether on secret warrants to scoop up a news agency’s phone records or secret orders to kill an American suspected of terrorism - especially coming from a president who once promised transparency and accountability. The administration has now lost all credibility. Mr. Based on an article in The Guardian published Wednesday night, we now know the Federal Bureau of Investigation and the National Security Agency used the Patriot Act to obtain a secret warrant to compel Verizon’s business services division to turn over data on every single call that went through its system. A senior administration official quoted in The Times offered the lame observation that the information does not include the name of any caller, as though there would be the slightest difficulty in matching numbers to names. That is a vital goal, but how is it served by collecting everyone’s call data? But what assurance do we have of that, especially since Ms.
Beyond "Soda, Pop, or Coke": Regional Dialect Variation in the Continental US Using data from Bert Vaux's dialect survey, we examine regional dialect variation in the continental United States. Each observation can be thought of as a realization of a categorical random variable with a particular parameter vector that is a function of location. Our goal was to interpolate among these points in order to estimate these parameter vectors at a given location, making use of a combination of kernel density estimation and non-parametric smoothing techniques. This results in a smooth field of parameter estimates over the prediction region. Using these results, a method for mapping aggregate dialect distance is developed. Please see the FAQ below. Thank you to everyone who has emailed over the past few days; the response these maps have been getting has been absolutely incredible. Please direct all media requests to Tracey Peake (tracey_peake@ncsu.edu) at the NCSU press office. Right now, the maps only take into account the four most popular answers for a given survey question.
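The excerpt does not say which estimator was used, so the following is only a sketch of the general idea of interpolating a categorical parameter vector over space. It assumes a Gaussian kernel, plain Euclidean distance on (x, y) coordinates, and a hypothetical function name `estimate_category_probs`; none of these details come from the original project.

```python
import math


def estimate_category_probs(observations, location, bandwidth=1.0):
    """Kernel-weighted estimate of answer probabilities at `location`.

    observations: list of ((x, y), category) pairs, one per survey response.
    Assumption: Gaussian kernel with Euclidean distance; the real study may differ.
    """
    weights = {}
    for (x, y), category in observations:
        d2 = (x - location[0]) ** 2 + (y - location[1]) ** 2
        w = math.exp(-d2 / (2.0 * bandwidth ** 2))  # nearby responses count more
        weights[category] = weights.get(category, 0.0) + w
    total = sum(weights.values())
    if total == 0.0:
        return {}  # every response is too far away to contribute
    # Normalize so the estimated probabilities sum to 1.
    return {c: w / total for c, w in weights.items()}
```

Evaluating this function on a grid of locations would give a smooth field of parameter estimates of the kind the text describes; the bandwidth controls how far each response's influence reaches.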
siphon - Siphon SIP/VoIP for iPhone and iPod Touch Home of the world's first free SIP/VoIP application for iPhone and iPod Touch 1 and 2. The Siphon SIP/VoIP project is the first in its category that works on iPhone and iPod Touch 2 with a headset for all SIP providers. It is a native application approved to run on 2.X using the internal microphone/speaker and headset. The application supports the SIP standard, preserving compatibility with hundreds of SIP providers, and offers a GUI that preserves the Apple design of native iPhone applications. Be careful: this version has not been tested on iPod Touch 1. One thing is sure: Touchmod's microphone doesn't work with iPhone 2.X OS. Currently, Siphon is localized in 15 languages. Screenshots Call Screenshots Settings Screenshots The parameters for several SIP providers are described on this page. Misc Home of the world's first free SIP/VoIP application for iPhone and iPod Touch 1.X.Y. The Siphon SIP/VoIP project is the first in its category that works on both iPhone and iPod Touch. For iPhone 1.X.Y Contacts thanks to Metabaron.
Download Nyquist Plug-ins Plug-in authors These plug-ins have been contributed by: Steve Daulton, Edgar Franke, Steven Jones, David R. Sky, Jvo Studer. Installing Plug-ins Nyquist plug-ins may be available as plain text files with the file extension .NY, or as a ZIP archive file. To install Nyquist plug-ins, place the NY file in the Plug-Ins folder inside the Audacity installation folder. On Windows computers, this is usually at C:\Program Files\Audacity (or C:\Program Files\Audacity 1.3 Beta (Unicode) in legacy 1.3 versions). Restart Audacity, then the plug-ins will appear underneath the divider in the "Effect", "Generate" or "Analyze" menus. Plug-in Lists In the lists below, click a "Downloads" link to go to the page for Generate, Effect or Analyze plug-ins, or click the individual links to go directly to the description of that plug-in. Left-clicking the View link displays the plug-in code in your browser. Generate Plug-in Downloads These plug-ins usually appear towards the bottom of the Generate menu in Audacity.