There are two basic components to the query syntax of Search CVS: a phrase (consisting of one or more words) and a set of zero or more parameters. Both need a little more explanation before their full power is clear.
Search CVS searches the message of each commit for matching words. Only the message field is matched against the phrase; to narrow the search further, use one or more parameters (see below). The syntax for the phrase is simply a set of one or more words.
Matching the phrase is done using MySQL's fulltext indexing. This splits the phrase into words and then searches for occurrences of each word, assigning a relevance value based on the number of matches. Once a set of results is found, the results are sorted by relevance value and then in chronological order.
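The ordering described above can be pictured with a small sketch. This is not the Search CVS implementation, which presumably sorts inside the database; the `Commit` record, its field names, the sample data, and the choice of oldest-first for equal relevance are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical result record; the field names are illustrative only.
Commit = namedtuple("Commit", ["message", "relevance", "date"])

def order_results(results):
    """Sort by relevance (highest first), then by date (oldest first, an assumption)."""
    return sorted(results, key=lambda c: (-c.relevance, c.date))

# Made-up sample data; ISO-style date strings sort correctly as plain text.
results = [
    Commit("fix hello world typo", 1.2, "2006-09-21 12:40:00"),
    Commit("hello world sample added", 2.5, "2006-09-20 08:00:00"),
    Commit("update world map", 1.2, "2006-09-19 17:30:00"),
]

for commit in order_results(results):
    print(commit.date, commit.relevance, commit.message)
```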
MySQL's fulltext indexing is relatively sophisticated, but most of the details aren't essential to know, with one exception: it does not index words shorter than a set number of characters (4 by default). If you search for a word shorter than that minimum, you won't get any results. In other words, using the default settings, both of these queries:
would return the same results, and a query of
would return no results.
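As a rough illustration of the minimum word length, the following sketch filters a phrase the way the text describes. It is a simplification: MySQL uses its own tokenizer and also ignores stopwords, neither of which is modelled here. The function name and the sample phrases are hypothetical.

```python
import re

MIN_WORD_LEN = 4  # MySQL's default minimum indexed word length, per the text above

def searchable_words(phrase, min_len=MIN_WORD_LEN):
    """Return the words in a phrase that are long enough to be indexed."""
    return [word for word in re.findall(r"\w+", phrase) if len(word) >= min_len]

print(searchable_words("fix the parser"))  # ['parser']
print(searchable_words("fix bug"))         # []
```

A phrase whose words are all filtered out, like the second one, corresponds to a query that returns no results.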
There is one other option for the phrase, and that is a quoted phrase. If you search for
only results with exactly the word 'hello' immediately followed by a space and then 'world' will be returned (technical note: this kicks the entire match into boolean mode), whereas searching for
would return results with both words as well as just one of the words. When searching for quoted phrases, the relevance value becomes a boolean that only indicates whether a match was found, saying nothing about the quality of the match. Effectively, this means that results for quoted phrases are sorted only chronologically. Note that the minimum character length also applies to quoted phrases, such that
will return no results, but
will return the expected results.
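The difference between the two modes can be sketched as the SQL fragment each kind of phrase might produce. This is an assumption about how such a query could be built, not a copy of the Search CVS code: the column name `message` and the function name are hypothetical, and `%s` is a placeholder in the style of common Python MySQL drivers.

```python
def build_match_clause(phrase, column="message"):
    """Return (sql_fragment, parameters) for a plain or quoted phrase.

    A phrase wrapped in double quotes is matched in boolean mode;
    anything else uses MySQL's default natural-language mode.
    """
    phrase = phrase.strip()
    quoted = len(phrase) >= 2 and phrase[0] == '"' and phrase[-1] == '"'
    if quoted:
        return f"MATCH ({column}) AGAINST (%s IN BOOLEAN MODE)", [phrase]
    return f"MATCH ({column}) AGAINST (%s)", [phrase]

print(build_match_clause('"hello world"'))
print(build_match_clause("hello world"))
```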
There is one exception to the above: if you enter only a bug number (with or without square brackets, but nothing else), you will be presented only with the changes associated with that bug.
Parameters can be used to further narrow down the set of results. Their syntax consists of a keyword followed by a colon, followed by the search term (with an optional space between the colon and search term), for example:
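A query of this kind could be recognised with a check like the one below. The regular expression and function name are illustrative assumptions; the actual detection logic in Search CVS may differ.

```python
import re

# Digits only, optionally wrapped in a matching pair of square brackets.
BUG_ONLY_RE = re.compile(r"\s*(?:\[(\d+)\]|(\d+))\s*")

def bug_number(query):
    """Return the bug number if the query is nothing but a bug number, else None."""
    match = BUG_ONLY_RE.fullmatch(query)
    if match is None:
        return None
    return int(match.group(1) or match.group(2))

print(bug_number("[156451]"))        # 156451
print(bug_number("156451 days: 7"))  # None
```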
Different keywords can be combined provided that they are separated by whitespace. For example:
days: 200 author: emerks
will find all commits by 'emerks' within the last 200 days.
In the case of duplicate parameters, only the last parameter will be used. For example:
author: emerks author: nickb
will only search for commits made by 'nickb'.
With the exception of the days and project/module keywords, the search terms are treated as substrings, meaning that a query of
will find all commits by 'emerks' as well as anyone else with 'merks' in their committer name.
Following is a complete list of search parameters, including an example and brief description:
|author||author: merks||restrict search to only this author (will match partial names)|
|branch||branch: R2_1_||restrict search to commits from this branch (will match partial names)|
|days||days: 7||restrict search to commits made in the last N days|
|startdate, enddate||startdate: 2006-09-21 12:40:00||restrict search to commits made after (startdate) or before (enddate) a date/time (inclusive of limit)|
|startdate & enddate||startdate: 2006-09-20 00:00:00 enddate: 2006-09-20 23:59:59||restrict search to commits made between two dates (inclusive of limits)|
|file||file: org.eclipse.emf/||restrict search to files matching this substring|
|project or module||project: org.eclipse.xsd||restrict search to commits in this CVS module (or project)|
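The parameter rules above (keyword, colon, optional space, last duplicate wins) can be sketched as a small parser. This is an illustrative reading of the documented syntax, not the parser Search CVS actually uses; the function name, the regular expression, and the treatment of everything outside a parameter as the phrase are assumptions.

```python
import re

KEYWORDS = ("author", "branch", "days", "startdate", "enddate",
            "file", "project", "module")

# keyword, colon, optional whitespace, then either a date with an
# optional time or a single whitespace-free term.
PARAM_RE = re.compile(
    r"\b(" + "|".join(KEYWORDS) + r"):\s*"
    r"(\d{4}-\d{2}-\d{2}(?:\s+\d{2}:\d{2}:\d{2})?|\S+)"
)

def parse_query(query):
    """Split a query into (phrase, params); a repeated keyword keeps its last value."""
    params = {}
    for match in PARAM_RE.finditer(query):
        params[match.group(1)] = match.group(2)  # later duplicates overwrite earlier ones
    # Whatever is left once the parameters are removed is treated as the phrase.
    phrase = " ".join(PARAM_RE.sub(" ", query).split())
    return phrase, params

print(parse_query("days: 200 author: emerks"))
print(parse_query("author: emerks author: nickb"))
print(parse_query("hello world startdate: 2006-09-20 00:00:00"))
```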
Following is a complete list of querystring parameters, including an example and brief description:
|q||q=project:+org.eclipse.emf+days:7||REQUIRED||standard querystring field used for search parameters|
|totalonly||q=project:+org.eclipse.emf+days:7&totalonly||OPTIONAL||if set, display only a count of the # of deltas found; overrides showbuglist|
|showbuglist||q=project:+org.eclipse.emf+days:7&showbuglist||OPTIONAL||if set, display csv list of bugs found|
|bugfilter||q=project:+org.eclipse.emf&bugfilter=nobug||OPTIONAL||if set, filter results based on presence/absence of a bugzilla # associated with each commit; values: "hasbug" or "nobug"|
|fullpath||q=author:nickb+days:7&fullpath||OPTIONAL||if set, show full cvs path of files rather than just the filename; handy for committers who work across projects|
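A querystring in the form shown in the table could be assembled as follows. The helper name and its arguments are hypothetical; only the parameter names (`q`, `totalonly`, `showbuglist`, `bugfilter`, `fullpath`) and the `hasbug`/`nobug` values come from the table. Colons are left unescaped to mirror the examples above.

```python
from urllib.parse import quote_plus

def build_querystring(q, totalonly=False, showbuglist=False,
                      bugfilter=None, fullpath=False):
    """Build a Search CVS style querystring; q is required."""
    if not q.strip():
        raise ValueError("q is required")
    parts = ["q=" + quote_plus(q, safe=":")]  # spaces become '+', ':' is kept
    if totalonly:
        parts.append("totalonly")
    if showbuglist:
        parts.append("showbuglist")
    if bugfilter is not None:
        if bugfilter not in ("hasbug", "nobug"):
            raise ValueError("bugfilter must be 'hasbug' or 'nobug'")
        parts.append("bugfilter=" + bugfilter)
    if fullpath:
        parts.append("fullpath")
    return "&".join(parts)

print(build_querystring("project: org.eclipse.emf days:7", totalonly=True))
# q=project:+org.eclipse.emf+days:7&totalonly
```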
At least one word or parameter must be entered to perform a search; otherwise, there are no arbitrary restrictions or limits.
Implementing Search CVS for your website is fairly straightforward, requiring both a front end and a back end to be properly connected and secured.
Search CVS requires MySQL 5.0 and PHP 4.3 (or higher) with mysql and pcre support.
- Check out the Search CVS code from :pserver:firstname.lastname@example.org:/cvsroot/eclipse, module org.eclipse.releng.basebuilder/plugins/org.eclipse.build.tools/scripts_cvs/searchcvs from the releng_test tag (it is not present in HEAD as of this writing).
cd $HOME; cvs -d :pserver:email@example.com:/cvsroot/eclipse -q ex -r releng_test -d searchcvs \
  org.eclipse.releng.basebuilder/plugins/org.eclipse.build.tools/scripts_cvs/searchcvs
- In the directory where you checked out the parser code, you should find a cvssrc/ directory. Check out all projects you would like to index into this directory. Customize setup.sh to check your project(s) out into this directory, then run it.
cd $HOME/searchcvs; ./setup.sh
- Configure the two MySQL dump files. You need to change the usernames and passwords as well as (optionally) the database name.
- Add the read-write user's credentials to
- If you renamed the database, ensure you change its name in
mysql -u root -p <mysql-modelingschema.dump
mysql -u root -p <mysql-users.dump
- Now that the database is ready, you can run parsecvs.sh, which will populate the database using the CVS logs of the checked out projects.
parsecvs.sh expects to be in the same directory as
- If you get odd PHP notices or warnings, script timeouts, or memory exhausted errors, you'll have to customize your /etc/php.ini file. Add or change the following:
;;;;;;;;;;;;;;;;;;;
; Resource Limits ;
;;;;;;;;;;;;;;;;;;;
max_execution_time = 240 ; Maximum execution time of each script, in seconds (default 30)
memory_limit = 128M      ; Maximum amount of memory a script may consume (default 8MB)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Error handling and logging ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
error_reporting = E_ALL & ~E_NOTICE
- To keep your database current, you should run parsecvs.sh as a nightly cron job, e.g., at 10pm every night:
00 22 * * * $HOME/searchcvs/parsecvs.sh > $HOME/searchcvs/parsecvs.log 2>&1
- Finally, you'll likely want to secure your server so that only limited access is available. Create a firewall restricting access to MySQL to only authorized servers:
# Flush the INPUT chain
/sbin/iptables -F INPUT
/sbin/iptables -P INPUT ACCEPT
# Flush FORWARD chain
/sbin/iptables -F FORWARD
/sbin/iptables -P FORWARD DROP

# Drop all mysql connections
/sbin/iptables -I INPUT -p tcp --dport 3306 -j REJECT
/sbin/iptables -I INPUT -p tcp --dport 3306 -j LOG
# Accept from localhost
/sbin/iptables -I INPUT -p tcp -s 127.0.0.1/24 --dport 3306 -j ACCEPT
# Accept from *.eclipse.org
/sbin/iptables -I INPUT -p tcp -s 22.214.171.124/27 --dport 3306 -j ACCEPT
- You can test whether the firewall is working by running:
telnet servername 3306
-or-
telnet localhost 3306
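Where telnet is not installed, the same connectivity check can be approximated with a short script. This is only a sketch of a TCP reachability test on the MySQL port; the function name and the 3-second timeout are assumptions, and a successful connection says nothing about whether MySQL credentials are accepted.

```python
import socket

def port_open(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # "servername" is a placeholder for your database host, as in the telnet example.
    for host in ("servername", "localhost"):
        print(host, "reachable" if port_open(host) else "blocked or unreachable")
```

Run it from a machine that should be blocked and from one that should be allowed, and compare the results.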
Web Server (www.eclipse.org)
- The latest copy of the web interface can be found in :pserver:firstname.lastname@example.org:/cvsroot/org.eclipse, module www/modeling. You will also see a copy of these files in the searchcvs/www/ folder you checked out above. Place the following files into your project's web directory, and customize them as needed for your project:
searchcvs.php
includes/searchcvs-common.php
includes/searchcvs.css
includes/db.php
- Additionally, there is a non-public file called
includes/db.php). Edit this file as you did for parsecvs-dbacccess.php above, this time adding the read-only user. To put this file on the www.eclipse.org server without it being available in CVS for public view, contact webmaster(at)eclipse.org or see details here.
- You might want to add a .cvsignore file to avoid accidentally committing this file to CVS.
You should now be able to use the Search CVS application on your website for your project(s). If you encounter any access problems between our MySQL server and www.eclipse.org, see bug 156451 for possible reasons.
Could Search CVS work with Subversion (SVN)? Well, yes and no. Details here.