Google Code Search received a small update: Google indexes snippets from web pages and individual code files, not only repositories and source code archives. You can also submit links to archives, CVS or Subversion repositories to be included in Google's index.
The ranking algorithms now favor class and function definitions.
A good alternative to Google Code Search is Krugle (read about its history), which has a less comprehensive index and doesn't support regular expressions for queries, but compensates with a great AJAX interface, tabs that let you open more search results in the same page, syntax highlighting and an easy way to browse within a project directory. Krugle also has a search engine for tech articles, documentation and books, and it powers SourceForge's search engine. One thing I especially like about Krugle is that you can search on specific code features such as function calls, function definitions, class definitions and comments.
Another popular search engine for code is Koders, which offers very good enterprise solutions, like integration with IDEs and version control systems. The index is less comprehensive than for Krugle and Google Code Search, and it returns worse results than the other search engines, but, according to Alexa, Koders is more popular than Krugle.
It would be nice to have a source code search engine that has Google's index and powerful queries, combined with Krugle's elegant interface and attention to detail, and the business-oriented features from Koders.
I also prefer Krugle's AJAX interface to Google's usual simple UI for code search (I haven't used Koders before). Krugle's UI also provides more context for the current code search result. A more advanced UI really works here, since the target users of the tool are fairly advanced users (developers). My only gripe with Krugle is its lack of proper support for the browser's back button. Though the tabbed interface somewhat makes up for that, I still think that every good AJAX UI should get this right.
Another thing that I like about Krugle is its syntax coloring.
Hi there,
Thanks for the nice write-up. I started typing "a few quick comments" here, but when it stretched into multiple paragraphs I turned it into a blog post at Code Search Mashup?
The 5-cent overview...
1. We're now in beta with our enterprise product.
2. Love to hear how you compare code index sizes.
3. What do you typically use regular expressions to search for?
Thanks,
-- Ken
@GT Staff - Re the back-button. Amen to that. We've fixed it in the latest revision of our enterprise product, which will get propagated to our public site in the next update.
Congratulations on your great site, Mr. Krugler.
To compare index sizes, I used some queries that should return a large number of results (int - Krugle:7,687,992 vs Google:38,700,000) and some measurable queries ("pca transformation" - Krugle:2 vs Google:20). Google even groups identical files.
I use regular expressions to search for different versions of a file name or to include synonyms in a query. It's also nice that you can control the search results better: get all the files that include "int md5(" at the beginning of a line, or get the files that contain "recursive" in a function definition.
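To make those two queries concrete, here is a rough sketch of what they could look like as Python regular expressions. It only illustrates the patterns, not how any of these search engines work internally. The file names and contents in `FILES` are made up for the example, and `matching_files` is a hypothetical helper.

```python
import re

# Hypothetical file contents standing in for indexed source files.
FILES = {
    "md5.c": "#include <stdio.h>\nint md5(const char *msg)\n{\n    return 0;\n}\n",
    "util.h": "static int md5(const char *msg);\n",
    "walk.py": "def recursive_walk(path):\n    return path\n",
}

# "int md5(" anchored at the beginning of a line.
# re.MULTILINE makes ^ match at the start of every line, not just the file.
MD5_AT_LINE_START = re.compile(r"^int md5\(", re.MULTILINE)

# "recursive" somewhere in the name of a Python function definition.
RECURSIVE_DEF = re.compile(r"^\s*def \w*recursive\w*\(", re.MULTILINE)


def matching_files(pattern, files):
    """Return the sorted names of the files whose text matches the pattern."""
    return sorted(name for name, text in files.items() if pattern.search(text))


print(matching_files(MD5_AT_LINE_START, FILES))
print(matching_files(RECURSIVE_DEF, FILES))
```

A plain keyword search for "int md5(" would also return `util.h`, where the text appears in the middle of a line. The anchored pattern leaves it out.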
Koders is now in beta with a Pro edition which is much like our Enterprise Edition, but for small teams and individual developers.
However, it is every bit as powerful as the Enterprise edition.
I'd love to hear your criticisms of our code search results. I definitely see the room for improvement. Thanks!
Let's say I want a Python implementation of a function that calculates the edit distance between two strings.
Query: edit distance
Language: Python
Koders (97 results) - the first 25 results are pretty bad. The first relevant result is #11 (smartmsgmerge.py) that has an implementation of the algorithm.
Krugle (228 results) - the result #11 from Koders is at #3 in Krugle. Overall, Krugle returns much better results.
Google Code Search (3,000 results) - the first result is OK. So are results #4, #5, #6, #7 and #8 (some of them are identical).
It's just a simple test (other tests give similar results), but if A>B is defined as "A returns better results than B",
Google Code Search > Krugle > Koders.
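For reference, this is roughly the kind of function the query is looking for. It is a minimal sketch of the standard dynamic-programming Levenshtein distance, not the code from any of the results mentioned above. The function name `edit_distance` is my own choice.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory

    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))
```

It keeps only two rows of the table at a time, so memory use grows with the length of the shorter string rather than with the product of both lengths.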
Hi Alex, we just updated our index which fixes a slight bug we had in how results were ranked. Let me know if the results are ordered better.
Unfortunately, we only return 110 results. Most likely we need to index more Python projects.
Sorry for the long delay in responding. I finally got around to blogging about the issue of index size today, see Index size, regular expressions and code search.
It has results from the test queries you mentioned that you'd used to estimate index size, which I think you'll find interesting.
Also, re your follow-up comment about searching for an edit distance formula...if you look at the hits near the end (on all 3 sites) they aren't very good. Using "edit distance" gives you higher quality results, which are also a bit more useful for comparison.
Similar shifts happen when you use the more explicit "Levenshtein" term, which is in line with what Matt Cutts suggested doing to get a sense of index size (use a rare term).
Finally - thanks! It's great that your post has sparked this discussion.
Another alternative to Google Code Search is SymbolHound: http://www.symbolhound.com/codesearch
You can search open source code repositories, with sorting by language and other features.