Many web developers use the database itself for search. Does it scale? I don't think so. Have you ever tried to search using the LIKE or CONTAINS operators in a database? Did it return the results closest to your search string first?
When you search for RAM, you may get RAMARAO, RAM EDARA, RAMGOPAL VERMA, VenkataRam and a bunch of other results. But which result is closest to your search string? Did you ever care about that?
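To make the problem concrete, here is a small sketch using Python's built-in sqlite3 module. The table name, column name and the extra sample name SURESH are made up for illustration. It shows that LIKE only answers "does the name contain RAM?" and says nothing about which match is closest.

```python
import sqlite3

# Hypothetical in-memory table and sample names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
names = ["RAMGOPAL VERMA", "VenkataRam", "SURESH", "RAMARAO", "RAM EDARA"]
conn.executemany("INSERT INTO users (name) VALUES (?)", [(n,) for n in names])

# LIKE is a yes/no substring filter (case-insensitive for ASCII in SQLite).
rows = conn.execute(
    "SELECT name FROM users WHERE name LIKE ?", ("%RAM%",)
).fetchall()
matches = [row[0] for row in rows]

# Every name containing "ram" comes back, with no relevance score attached.
print(matches)
```

Without an ORDER BY the database is free to return the rows in any order, so a closer match such as RAMARAO is not guaranteed to come before RAMGOPAL VERMA.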
Many people use Sphinx or Apache Lucene. Did you ever try to write your own search algorithm?
Ever wondered how Google search works when your search has many words? Roughly, it:
1) Gets the list of document IDs for each word you searched and intersects those lists.
2) Uses algorithms that relate your search to its context.
3) Ranks the results based on the number of times the string appears on a web page, or the number of hits on the page for that search string.
4) Displays the results to you.
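Steps 1 and 3 above can be sketched with a tiny inverted index. This is only a toy under my own assumptions: the three sample documents, the `index` layout and the `search` function are made up, ranking uses plain word counts, and step 2 (context) is left out entirely. A real search engine does far more than this.

```python
from collections import Counter, defaultdict

# Hypothetical sample documents: doc_id -> text.
docs = {
    1: "python search with python dictionaries",
    2: "search users by email",
    3: "python trie search",
}

# Inverted index: word -> {doc_id: number of times the word appears}.
index = defaultdict(dict)
for doc_id, text in docs.items():
    for word, count in Counter(text.lower().split()).items():
        index[word][doc_id] = count


def search(query):
    """Return ids of docs containing every query word, best match first."""
    words = query.lower().split()
    if not words:
        return []
    # Step 1: one set of doc ids per word, then intersect them.
    doc_sets = [set(index.get(word, {})) for word in words]
    common = set.intersection(*doc_sets)

    # Step 3: score each doc by how often the query words appear in it.
    def score(doc_id):
        return sum(index[word][doc_id] for word in words)

    # Highest score first; ties broken by doc id to keep the order stable.
    return sorted(common, key=lambda doc_id: (-score(doc_id), doc_id))


print(search("python search"))  # [1, 3]
```

Document 2 is dropped because it does not contain "python", and document 1 comes first because "python" appears in it twice.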
I am developing a mini text search algorithm that should be useful for many websites that need to search their users by email, first name, last name, etc.
I am going to use Python dictionaries together with an edit distance algorithm. This may take a little more memory, but a dictionary lookup takes just O(1) on average to retrieve the user details, and results will be ranked by how close each string is to your search string. To save memory we can also use a trie data structure with some modifications, but a lookup would then take O(k) time, where k is the length of your search string.
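Here is a minimal sketch of the dictionary plus edit distance idea, not the finished algorithm. The sample users, the example.com emails, the `substring_index` layout and the function names are all assumptions made for illustration. I index every substring of each name so that VenkataRam is also found for RAM, which costs a lot more memory than a plain name lookup would.

```python
from collections import defaultdict

# Hypothetical user records: user_id -> details.
users = {
    1: {"name": "RAMGOPAL VERMA", "email": "rgv@example.com"},
    2: {"name": "VenkataRam", "email": "venkat@example.com"},
    3: {"name": "RAMARAO", "email": "ramarao@example.com"},
    4: {"name": "RAM EDARA", "email": "edara@example.com"},
}


def edit_distance(a, b):
    """Levenshtein distance between a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]


# Dictionary index: every lowercase substring of a name -> set of user ids.
substring_index = defaultdict(set)
for user_id, user in users.items():
    name = user["name"].lower()
    for start in range(len(name)):
        for end in range(start + 1, len(name) + 1):
            substring_index[name[start:end]].add(user_id)


def search_users(query):
    """Find names containing the query, closest name first."""
    q = query.lower()
    ids = substring_index.get(q, set())  # a single dictionary lookup
    found = [users[user_id]["name"] for user_id in ids]
    return sorted(found, key=lambda n: (edit_distance(q, n.lower()), n))


print(search_users("RAM"))
# ['RAMARAO', 'RAM EDARA', 'VenkataRam', 'RAMGOPAL VERMA']
```

The lookup itself is one dictionary access. The ranking then costs one edit distance computation per matching name, so the total time still grows with the number of matches.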
Watch this space for more info on my search algorithm. I will compare a Python dictionary against a Python trie for searching.
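For the trie side of that comparison, a bare-bones version could look like the sketch below. The class names, the `user_ids` set stored on every node (one possible "modification") and the sample data are my own assumptions. This version only matches prefixes, so VenkataRam would not be found for RAM unless its suffixes were inserted too.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # next character -> TrieNode
        self.user_ids = set()  # ids of users whose key passes through here


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, user_id):
        """Add key one character at a time, tagging each node on the path."""
        node = self.root
        for ch in key.lower():
            node = node.children.setdefault(ch, TrieNode())
            node.user_ids.add(user_id)

    def lookup(self, prefix):
        """Walk k = len(prefix) nodes and return the ids stored at the end."""
        node = self.root
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.user_ids)


# Hypothetical sample data: user_id -> name.
trie = Trie()
sample_names = {1: "RAMARAO", 2: "RAM EDARA", 3: "RAMGOPAL VERMA", 4: "VenkataRam"}
for user_id, name in sample_names.items():
    trie.insert(name, user_id)

print(sorted(trie.lookup("ram")))  # [1, 2, 3]
```

Each lookup walks one node per character of the search string, which is where the O(k) comes from.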