Platform-releng/How to check integrity of downloads

Verifying integrity of downloads from the Eclipse Platform Project and Equinox

First and foremost, most users should read and follow the instructions at How to check the integrity of downloads from the Eclipse Foundation. Those instructions cover the majority of cases and are the best guidance for general users.

But, if you are a committer, release engineer, or even a power user, some of the following information might be important to know. (The following information primarily applies to the Neon (4.6) release, but some applies to older releases too.)

Downloadable artifacts from the Eclipse or Equinox Projects each have an associated file with the same name as the artifact file but ending with ".sha512" or ".sha256"; these checksum files are in a directory named 'checksum'. We recommend using these files for programmatic verification, as the checksums are too long to verify visually. We recommend the SHA512 checksums; the SHA256 checksums are provided simply because other sites provide them, so some users may already have automated methods in place that use them.

An example using 'http'

As an example, you can download the following artifact, say using wget

 wget -O eclipse-SDK-4.5.2-linux-gtk-x86_64.tar.gz  http://download.eclipse.org/eclipse/downloads/drops4/R-4.5.2-201602121500/eclipse-SDK-4.5.2-linux-gtk-x86_64.tar.gz 

then you would also want to download the corresponding ".sha512" file to the same location.

  wget -O eclipse-SDK-4.5.2-linux-gtk-x86_64.tar.gz.sha512 http://download.eclipse.org/eclipse/downloads/drops4/R-4.5.2-201602121500/checksum/eclipse-SDK-4.5.2-linux-gtk-x86_64.tar.gz.sha512

Then, to verify the integrity of the downloaded artifact you would run

  sha512sum --check eclipse-SDK-4.5.2-linux-gtk-x86_64.tar.gz.sha512

If the artifact downloaded correctly, the above command would respond with

   eclipse-SDK-4.5.2-linux-gtk-x86_64.tar.gz: OK
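For scripted use, the check above could be wrapped in a small helper. The following is only a sketch: the function name verify_artifact is hypothetical, and it assumes the artifact and its ".sha512" file sit side by side in the current directory, as in the example.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Eclipse tooling): verify ARTIFACT
# against ARTIFACT.sha512, both assumed to be in the current directory.
verify_artifact() {
  artifact=$1
  if [ ! -f "$artifact" ] || [ ! -f "$artifact.sha512" ]; then
    echo "missing $artifact or $artifact.sha512" >&2
    return 2
  fi
  # --check re-computes the hash of each file listed in the checksum file
  # and prints "<file>: OK" or "<file>: FAILED".
  sha512sum --check "$artifact.sha512"
}

# Example call, using the file name from the text above:
#   verify_artifact eclipse-SDK-4.5.2-linux-gtk-x86_64.tar.gz
```

The exit status is zero only when the checksum matches, so the helper can be used directly in an "if" or with "&&" in a build script.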

Limitations using unencrypted connections and how to improve

The "download server" does not have "https://" protocol available since it would be expensive to encrypt all the large artifacts from it (See bug 435426). The problem with using "http://" is that in general you cannot count on the authenticity of the connection. While it would be extremely rare, and has never been reported for "eclipse.org", in principle, someone could manage to stage a "man in the middle" attack and deliver some sort of tainted artifact and likewise deliver their own checksum which would match the tainted artifact. [Note: as has been said many times, security is always a matter of degree. It is still a good idea to always "check integrity", especially if, for example, you download the artifact from a mirror, where it might be easier to taint the artifacts, and then get the checksums from 'download.eclipse.org', which would require a more sophisticated manipulation to "fool" you or your network.]

Release engineers, building on "build.eclipse.org", can get direct file access to the artifacts and checksums and then the checksum check is secure (authentic) and can be trusted to simply confirm the copy was done intact and the artifact has not been changed since it was built.

Committers can get the artifact using unencrypted methods, and then get the checksum using 'scp', or rsync over an SSH connection, to be sure of the authenticity of their connection and hence the checksum. For example,

 scp <committer_id>@build.eclipse.org://home/data/httpd/download.eclipse.org/eclipse/downloads/drops4/R-4.5.2-201602121500/checksum/eclipse-SDK-4.5.2-linux-gtk-x86_64.tar.gz.sha512  .

Making use of the signed files of checksums

Beginning with Neon, in addition to the individual "*.sha512" and "*.sha256" files we also make all checksums available in a single text file. These are linked from the download page for each build of Eclipse and Equinox. The link is named similar to "SHA512 Checksums" and points to a file that is named similar to

<eclipse-download-URL>/checksum/eclipse-<buildId>-SUMSSHA512

or for Equinox

<equinox-download-URL>/checksum/equinox-<buildId>-SUMSSHA512

In addition -- and this is the reason for making them all available in one file -- there is a matching file that is named the same as above, but ending with ".asc". This is a GPG detached signature file that can be used to confirm the integrity and authenticity of the checksum file.

The idea is that once the validity and authenticity of the plain-text checksums file have been confirmed, the checksums in it can be trusted to be authentic and valid checksums for the files it lists.

Example of using GPG with the checksums files

This example is primarily based on the Linux command line, where the GPG tools are part of most distributions, but there are similar tools available for all other platforms, such as Windows and MacOSX. There are also many UI tools available to make some things easier, and there may be variations in options or commands depending on exactly what you have installed. [Note to readers: if you know or learn significant tips or tricks, please update this document.]

This example makes things look harder than they are, simply because it goes through all the steps for a "first time user". After doing it once, subsequent use will be much easier.

The basic command to verify the *-SUMSSHA512 file, after downloading both it and its ".asc" counterpart, is, to pick a concrete example,

gpg --verify eclipse-I20160518-2000-SUMSSHA512.asc eclipse-I20160518-2000-SUMSSHA512 

or, as a slight shorthand, you can omit the second file if its name is the same as the "*.asc" file (minus the ".asc" extension). To be explicit, the following is equivalent to the above command.

gpg --verify eclipse-I20160518-2000-SUMSSHA512.asc

If this is the first time you have tried to verify (i.e. you do not yet have the eclipse-dev public key) you would receive a message such as

gpg: Signature made Mon 16 May 2016 10:11:24 PM EDT using RSA key ID 9E48E229
gpg: Can't check signature: public key not found

This means you must import the public key with id 9E48E229 into your keyring. One way of doing that is with the command

gpg --keyserver pgp.mit.edu --recv-keys 9E48E229

Just about any well-known keyserver would work, since they communicate with each other to replicate their databases. Many UI programs make this easier, but some (such as the one distributed with Ubuntu) require "0x" to be prepended to the key ID.

Now, after "receiving the key" when you run the command,

 gpg --verify eclipse-I20160518-2000-SUMSSHA512.asc eclipse-I20160518-2000-SUMSSHA512

You will get a response similar to the following:

 gpg: Signature made Mon 16 May 2016 10:11:24 PM EDT using RSA key ID 9E48E229
 gpg: Good signature from "Eclipse Project <eclipse-dev@eclipse.org>"
 gpg: WARNING: This key is not certified with a trusted signature!
 gpg:          There is no indication that the signature belongs to the owner.
 Primary key fingerprint: 869F F7E3 1C98 FBCF CF16  7CDE 01D8 1CA5 60A4 8EFD
      Subkey fingerprint: F7B8 1473 283E CB71 19A7  473A BDF4 7870 9E48 E229

That response above tells you the checksums file has not been tampered with since it was signed. If the file had been changed since it was signed you would get a response similar to the following:

 gpg: Signature made Mon 16 May 2016 10:11:24 PM EDT using RSA key ID 9E48E229
 gpg: BAD signature from "Eclipse Project <eclipse-dev@eclipse.org>"
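When the check runs in a script rather than in front of a person, the human-readable messages above are awkward to parse. One possible approach, sketched here under assumptions, is to have gpg emit its machine-readable status lines with the --status-fd option and look for GOODSIG or BADSIG. The function name status_reports_good_signature is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read gpg status lines on standard input and succeed
# only if a GOODSIG line is present and no BADSIG line is.
status_reports_good_signature() {
  awk '
    /^\[GNUPG:\] GOODSIG / { good = 1 }
    /^\[GNUPG:\] BADSIG /  { bad = 1 }
    END { exit (good && !bad) ? 0 : 1 }
  '
}

# Example call (file names taken from the text above). --status-fd 1 asks
# gpg to write machine-readable "[GNUPG:] ..." lines to standard output.
#   gpg --status-fd 1 --verify eclipse-I20160518-2000-SUMSSHA512.asc \
#       eclipse-I20160518-2000-SUMSSHA512 2>/dev/null |
#     status_reports_good_signature && echo "signature is good"
```

A good signature from an expired or revoked key is reported with different status keywords, so this sketch treats those cases as "not good".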

But, what about that "WARNING: This key is not certified with a trusted signature"? In this context (namely, the "first time through" this procedure) that message does not mean anything bad. It simply means you, the user, have not assigned a "trust level" to it. (It would be worth investigation if you had already "trusted" the key once, and then in the future received the warning.)

The mechanics of assigning trust are easy enough (again, many UI programs make this even easier). From the command line, the command would be similar to

 gpg --edit-key 9E48E229 trust

This will then enter interactive mode, and offer choices such as

 1 = I don't know or won't say
 2 = I do NOT trust
 3 = I trust marginally
 4 = I trust fully
 5 = I trust ultimately
 m = back to the main menu

But, which to choose? This "human part" of trust is both a hard part of GPG and one of its strengths. To avoid any warning from the '--verify' command, the trust must be "ultimate" (or the key must satisfy other conditions in your "web of trust"). In an ideal world, you would trust the signature of someone you had met face-to-face, whose identity you had verified, and whose "fingerprint" you had received from something like their business card. But that is impractical for most users, and some would say overkill for simply verifying a download. Another "pretty good" alternative is to verify the fingerprint from an independent, "safe" source, such as a web page served over an "https" connection. Accordingly, the fingerprints for "eclipse-dev" and "equinox-dev" are currently published on my (David Williams) profile page at Eclipse.org.
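Comparing a 40-character fingerprint by eye is error-prone, so the comparison can be scripted. The sketch below is an assumption-laden illustration: the function names are hypothetical, and EXPECTED_FPR is simply copied from the example output earlier on this page, so you should replace it with the value you obtained from the independent "https" source.

```shell
#!/bin/sh
# Value copied from the example gpg output on this page; confirm it against
# an independent https source before relying on it.
EXPECTED_FPR="869F F7E3 1C98 FBCF CF16  7CDE 01D8 1CA5 60A4 8EFD"

# Strip all whitespace and upper-case the hex digits.
normalize_fpr() {
  printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Succeed only if the two fingerprints are equal after normalization.
fingerprints_match() {
  [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$2")" ]
}

# Print the primary key fingerprint that the local keyring holds for KEY_ID.
local_fpr() {
  gpg --with-colons --fingerprint "$1" |
    awk -F: '$1 == "fpr" { print $10; exit }'
}

# Example call, after the key has been received:
#   fingerprints_match "$(local_fpr 9E48E229)" "$EXPECTED_FPR" &&
#     echo "fingerprint matches"
```

Note that 9E48E229 identifies the signing subkey in the example output, while the first fingerprint gpg lists is that of the primary key, which is the value compared here.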

Note: the procedure above is identical for Equinox except the equinox-dev signing key has an ID of 470B675A whereas the Eclipse project uses the eclipse-dev signing key with an ID of 9E48E229.

Summary of example

After being through the full procedure above, the day-to-day use of verifying the checksums files is simply a matter of downloading the two files, and then calling

  gpg --verify eclipse-${BUILD_ID}-SUMSSHA512.asc

Verifying the artifacts with the checksums

The purpose of doing the above verification of the checksums files is to know you can trust the checksums to verify the artifacts you download from Eclipse or Equinox. The checksums files contain all the checksums for all possible artifacts from a download site. That is a lot more artifacts than you typically need or want to verify. If you used the traditional "verify checksums" command, something similar to the following:

 sha512sum --check eclipse-${BUILD_ID}-SUMSSHA512 

then there would be a lot of warnings about not being able to read some of the files -- files you never downloaded to begin with. That is why the output of the traditional command is typically filtered with a command such as grep. For example,

sha512sum --check eclipse-I20160518-2000-SUMSSHA512 2>&1 | grep OK

or, if you just had one file to check, you could use something similar to

sha512sum --check eclipse-I20160518-2000-SUMSSHA512 2>&1 | grep eclipse-SDK-I20160518-2000-linux-gtk-x86_64.tar.gz
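One drawback of filtering the output with grep is that the exit status then reflects grep rather than the checksum comparison. An alternative, sketched here with a hypothetical function name, is to filter the input instead: pick only the line for the artifact of interest out of the checksums file and pass that line to sha512sum on standard input. The sketch assumes file names without spaces and that the artifact is in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: check a single ARTIFACT against a combined checksums
# file. Lines look like "<hash>  <file>" or "<hash> *<file>".
verify_one() {
  sums_file=$1
  artifact=$2
  # Keep only the line whose file name field matches exactly, then let
  # sha512sum read that line from standard input ("-").
  awk -v f="$artifact" '$2 == f || $2 == ("*" f)' "$sums_file" |
    sha512sum --check -
}

# Example call, using the names from the text above:
#   verify_one eclipse-I20160518-2000-SUMSSHA512 \
#       eclipse-SDK-I20160518-2000-linux-gtk-x86_64.tar.gz
```

If the artifact is not listed in the checksums file at all, sha512sum receives no input lines and exits with a non-zero status, so an unlisted file is never reported as verified.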

Why go through all this? The advantage of using the file of checksums is that it can be better verified to be authentic even from an 'http' connection or even from an unknown mirror by verifying its signature before using the checksums to verify the artifacts. Not everyone is that concerned with security -- but, in this day and age, you should be!
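Putting the two halves together, the order matters: the checksums are only worth using once the signature on the checksums file has been verified. The following is a minimal sketch of that ordering, not an official script. The function name verify_download is hypothetical, and it assumes the checksums file, its ".asc" signature, and the artifact are all in the current directory and that the signing key is already in your keyring.

```shell
#!/bin/sh
# Hypothetical helper: check ARTIFACT only after the signature on SUMS_FILE
# has been verified.
verify_download() {
  sums_file=$1
  artifact=$2
  # Step 1: refuse to go on unless gpg accepts the detached signature.
  if ! gpg --verify "$sums_file.asc" "$sums_file"; then
    echo "signature check failed; not trusting $sums_file" >&2
    return 1
  fi
  # Step 2: check only the requested artifact against the trusted list.
  awk -v f="$artifact" '$2 == f || $2 == ("*" f)' "$sums_file" |
    sha512sum --check -
}

# Example call, using the names from the text above:
#   verify_download eclipse-I20160518-2000-SUMSSHA512 \
#       eclipse-SDK-I20160518-2000-linux-gtk-x86_64.tar.gz
```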