Search CVS

User Guide

The query syntax of Search CVS has two basic components: a phrase (consisting of one or more words) and a set of zero or more parameters. Both need a little more explanation before their full power becomes clear.

The Phrase

Search CVS searches the message of each commit for words that match the phrase. Only the message field is matched against the phrase; if you would like to narrow down the search further, use one or more parameters (see below). The syntax for the phrase is simply a set of one or more words.

Matching the phrase is done using MySQL's fulltext indexing. This splits the phrase into words and then searches for occurrences of each word, assigning a relevance value based on the number of matches. Once a set of results is found, the results are sorted by relevance value and then in chronological order.

MySQL's fulltext indexing is relatively sophisticated, but most of the details aren't essential to know, with one exception: it does not index words that are shorter than a set number of characters (4 by default). If you search for a word shorter than that minimum, it will not match anything. In other words, using the default settings, both of these queries:

hello emf

and

hello

would return the same results, and a query of

emf

would return no results.
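
The effect of the minimum word length can be illustrated with a small Python sketch. It is only an approximation under stated assumptions: the function name indexed_words is hypothetical, words are split on whitespace, and the rest of MySQL's tokenizer (for example its stopword list) is not modelled.

```python
# Assumption: 4 is the default minimum indexed word length described above.
MIN_WORD_LEN = 4


def indexed_words(phrase, min_len=MIN_WORD_LEN):
    """Return the words of a phrase that are long enough to be indexed."""
    return [word for word in phrase.split() if len(word) >= min_len]


print(indexed_words("hello emf"))  # ['hello']
print(indexed_words("hello"))      # ['hello']
print(indexed_words("emf"))        # []
```

Because 'emf' is dropped, the first two queries search for the same single word, and the third has nothing left to search for.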

There is one other option for the phrase, and that is a quoted phrase. If you search for

"hello world"

only results with exactly the word 'hello' immediately followed by a space and then 'world' will be returned (technical note: this kicks the entire match into boolean mode), whereas searching for

hello world

would return results containing both words as well as results containing just one of them. When searching for quoted phrases, the relevance value becomes a boolean that only indicates whether a match was found, saying nothing about the quality of the match. Effectively, this means that results for quoted phrases are sorted only chronologically. Note that the minimum character length also applies to quoted phrases, such that

"emf emf"

will return no results, but

"hello emf"

will return the expected results.

There is one exception to the above: if you enter only a bug number (with or without square brackets, and nothing else), you will be presented with only the changes associated with that bug.
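
As a rough illustration of the bug-number exception, the following Python sketch recognises a query that consists of nothing but a bug number, with or without surrounding square brackets. The function name bug_only_query and the regular expression are assumptions made for this example; they are not taken from the Search CVS source.

```python
import re

# Either "[digits]" or bare "digits", with optional surrounding whitespace.
BUG_ONLY = re.compile(r"^\s*(?:\[(\d+)\]|(\d+))\s*$")


def bug_only_query(query):
    """Return the bug number if the query is only a bug number, else None."""
    match = BUG_ONLY.match(query)
    if match is None:
        return None
    return int(match.group(1) or match.group(2))


print(bug_only_query("164719"))        # 164719
print(bug_only_query("[164719]"))      # 164719
print(bug_only_query("[164719] fix"))  # None
```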

The Parameters

Parameters can be used to further narrow down the set of results. Their syntax consists of a keyword followed by a colon, followed by the search term (with an optional space between the colon and the search term), for example:

author: emerks

Different keywords can be combined provided that they are separated by whitespace. For example:

days: 200 author: emerks

will find all commits by 'emerks' within the last 200 days.

In the case of duplicate parameters, only the last parameter will be used. For example:

author: emerks author: nickb

will only search for commits made by 'nickb'.

With the exception of the days and project/module keywords, the search terms are treated as substrings, meaning that a query of

author: merks

will find all commits by 'emerks' as well as anyone else with 'merks' in their committer name.
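
The parameter rules above (keyword, colon, optional space, search term; last duplicate wins) can be sketched in Python as follows. This is a simplified illustration, not the actual Search CVS parser: parse_query is a hypothetical name, and the sketch assumes every search term is a single whitespace-free token, so date values that include a time would need extra handling.

```python
import re

# keyword, colon, optional whitespace, then one whitespace-free search term
PARAM = re.compile(r"(\w+):\s*(\S+)")


def parse_query(query):
    """Split a query into (phrase words, parameter dict)."""
    params = {}
    for keyword, term in PARAM.findall(query):
        params[keyword] = term  # a later duplicate overwrites an earlier one
    phrase = PARAM.sub(" ", query).split()
    return phrase, params


print(parse_query("days: 200 author: emerks"))
# ([], {'days': '200', 'author': 'emerks'})
print(parse_query("author: emerks author: nickb"))
# ([], {'author': 'nickb'})

# Substring matching: 'merks' matches the committer name 'emerks'.
print("merks" in "emerks")  # True
```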

Parameter List

Following is a complete list of search parameters, including an example and brief description:

Parameter            Example                          Description
author               author: merks                    restrict search to only this author (will match partial names)
branch               branch: R2_1_                    restrict search to commits from this branch (will match partial names)
days                 days: 7                          restrict search to commits made in the last N days
startdate, enddate   startdate: 2006-09-21 12:40:00   restrict search to commits made before or after a date/time (inclusive of limit)
                     enddate: 2001-05-23 00:00:00
startdate & enddate  startdate: 2006-09-20 00:00:00   restrict search to commits made between two dates (inclusive of limits)
                     enddate: 2006-09-20 23:59:59
file                 file: org.eclipse.emf/           restrict search to files matching this substring
project or module    project: org.eclipse.xsd         restrict search to commits in this CVS module (or project)
                     module: org.eclipse.xsd
bugid                bugid: 191173                    restrict search to commits tagged with the given bug id

Querystring Options

Following is a complete list of querystring parameters, including an example and brief description:

Parameter    Example                                         Description
q            q=project:+org.eclipse.emf+days:7               REQUIRED standard querystring field used for search parameters
totalonly    q=project:+org.eclipse.emf+days:7&totalonly     OPTIONAL if set, display only a count of the # of deltas found; overrides showbuglist
showbuglist  q=project:+org.eclipse.emf+days:7&showbuglist   OPTIONAL if set, display csv list of bugs found
bugfilter    q=project:+org.eclipse.emf&bugfilter=nobug      OPTIONAL if set, filter results based on presence/absence of a bugzilla # associated with each commit; values: "hasbug" or "nobug"
fullpath     q=author:nickb+days:7&fullpath                  OPTIONAL if set, show full cvs path of files rather than just the filename; handy for committers who work across projects
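
For anyone generating these querystrings from a script, here is a minimal Python sketch that assembles one from a search query and the options listed above. The helper name build_querystring is hypothetical, and the sketch only builds the part after the '?'; the address of the searchcvs.php page is site-specific and is not assumed here.

```python
from urllib.parse import quote_plus


def build_querystring(query, flags=(), bugfilter=None):
    """Build a Search CVS querystring such as 'q=...&totalonly'."""
    # Spaces become '+'; colons are left as-is to match the examples above.
    parts = ["q=" + quote_plus(query, safe=":")]
    # Valueless options such as totalonly, showbuglist or fullpath.
    parts.extend(flags)
    if bugfilter is not None:
        parts.append("bugfilter=" + quote_plus(bugfilter))
    return "&".join(parts)


print(build_querystring("project: org.eclipse.emf days:7", flags=["totalonly"]))
# q=project:+org.eclipse.emf+days:7&totalonly
print(build_querystring("project: org.eclipse.emf", bugfilter="nobug"))
# q=project:+org.eclipse.emf&bugfilter=nobug
```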

Restrictions

At least one word or parameter must be entered to perform a search; otherwise, there are no arbitrary restrictions or limits.

Setup

The most important part of Search CVS's Bugzilla-to-CVS integration is one simple workflow change that everyone can make, and that many are already making: labelling each commit with its bug number (see Workflow below).

Beyond that, implementing Search CVS for your website is fairly straightforward, requiring both a front end and a back end to be properly connected and secured.

Workflow

To identify that a given commit is related to a given bug, developers need to label the commit with the bug number. Numerous formats are supported (bug 153838, bug 164719), but the recommended one is to simply use:

[164719]
  • If you use Mylyn, set Window > Preferences... > Mylyn > Team > Commit Comment Template to
[${task.id}]
  • If not, just ensure that every commit that is associated with a bug has a comment which begins with the bug number in square brackets, optionally followed by a comment.
  • Should a commit be relevant for more than one bug, simply enter one bug number and optional comment per line of the commit comment.
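
The recommended convention is easy to check mechanically. The Python sketch below is an illustration only and not the parser that Search CVS itself uses: the name bugs_in_commit_message is hypothetical, and only the recommended square-bracket format is handled, not the other supported formats.

```python
import re

# A line that begins with "[digits]", optionally followed by a comment.
BUG_LINE = re.compile(r"^\s*\[(\d+)\]\s*(.*)$")


def bugs_in_commit_message(message):
    """Return a list of (bug number, comment) pairs, one per labelled line."""
    found = []
    for line in message.splitlines():
        match = BUG_LINE.match(line)
        if match:
            found.append((int(match.group(1)), match.group(2)))
    return found


message = "[164719] fix NPE\n[153838] update docs"
print(bugs_in_commit_message(message))
# [(164719, 'fix NPE'), (153838, 'update docs')]
```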

Database Server

Search CVS requires MySQL 5.0 and PHP 4.3 (or higher) with mysql and pcre support.

  • Check out the parser code into $HOME/searchcvs:
cd $HOME; cvs -d :pserver:anonymous@dev.eclipse.org:/cvsroot/eclipse -q ex -r releng_test -d searchcvs \
  org.eclipse.releng.basebuilder/plugins/org.eclipse.build.tools/scripts_cvs/searchcvs
  • In the directory where you checked out the parser code, you should find a cvssrc/ directory. All projects you would like to index must be checked out into this directory. Customize setup.sh to check out your project(s) there, then run it.
cd $HOME/searchcvs; ./setup.sh
  • Configure the two MySQL dump files. You need to change the usernames and passwords as well as (optionally) the database name.


  • Add the read-write user's credentials to parsecvs-dbaccess.php.


  • If you renamed the database, ensure you change its name in parsecvs.php too.


mysql -u root -p <mysql-modelingschema.dump
mysql -u root -p <mysql-users.dump
mysql -u root -p <mysql-releases.dump
  • Now that the database is ready, you can run parsecvs.sh, which will populate the database using the CVS logs of the checked-out projects. parsecvs.sh expects to be in the same directory as parsecvs.php and cvssrc/.
$HOME/searchcvs/parsecvs.sh
  • If you get odd PHP notices or warnings, script timeouts, or memory exhausted errors, you'll have to customize your /etc/php.ini file. Add or change the following:
;;;;;;;;;;;;;;;;;;;
; Resource Limits ;
;;;;;;;;;;;;;;;;;;;

max_execution_time = 240; Maximum execution time of each script, in seconds (default 30)
memory_limit = 128M     ; Maximum amount of memory a script may consume (default 8MB)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Error handling and logging ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

error_reporting = E_ALL & ~E_NOTICE
  • To keep your database current, you should run parsecvs.sh as a nightly cron job, e.g., at 10pm every night:
00 22 * * * $HOME/searchcvs/parsecvs.sh > $HOME/searchcvs/parsecvs.log 2>&1
  • Finally, you'll likely want to secure your server so that only limited access is available. Create a firewall restricting access to MySQL to only authorized servers:
# Flush the INPUT chain
/sbin/iptables -F INPUT
/sbin/iptables -P INPUT ACCEPT

# Flush FORWARD chain
/sbin/iptables -F FORWARD
/sbin/iptables -P FORWARD DROP

# Reject and log all mysql connections
/sbin/iptables -I INPUT -p tcp --dport 3306 -j REJECT
/sbin/iptables -I INPUT -p tcp --dport 3306 -j LOG

# Accept from localhost
/sbin/iptables -I INPUT -p tcp -s 127.0.0.1/24 --dport 3306 -j ACCEPT

# Accept from *.eclipse.org
/sbin/iptables -I INPUT -p tcp -s 206.191.52.32/27 --dport 3306 -j ACCEPT
  • You can test whether the firewall is working by running:
telnet servername 3306
 -or-
telnet localhost 3306

Web Server (www.eclipse.org)

  • The latest copy of the web interface can be found in :pserver:anonymous@dev.eclipse.org:/cvsroot/org.eclipse, module www/modeling. You will also see a copy of these files in the searchcvs/www/ folder you checked out above. Place the following files into your project's web directory, and customize them as needed for your project:
searchcvs.php
includes/searchcvs-common.php
includes/searchcvs.css
includes/db.php
  • Additionally, there is a non-public file called includes/searchcvs-dbaccess.php (referenced by includes/db.php). Edit this file as you did for parsecvs-dbaccess.php above, this time adding the read-only user's credentials. To put this file on the www.eclipse.org server without it being available in CVS for public view, contact webmaster(at)eclipse.org or see details here.


  • You might want to add a .cvsignore file to avoid accidentally committing this file to CVS.


You should now be able to use the Search CVS application on your website for your project(s). If you encounter any access problems between our mysql server and www.eclipse.org, see bug 156451 for possible reasons.

Subversion

Could Search CVS work with Subversion (SVN)? Well, yes and no. Details here.

See Also