The open source clustering software available here implements the most commonly used clustering methods for gene expression data analysis. The clustering methods can be used in several ways. Cluster 3.0 provides a Graphical User Interface to access the clustering routines. It is available for Windows, Mac OS X, and Linux/Unix. Python users can access the clustering routines by using Pycluster, which is an extension module to Python. People who want to make use of the clustering algorithms in their own C, C++, or Fortran programs can download the source code of the C Clustering Library.

 

Cluster 3.0 is an enhanced version of Cluster, which was originally developed by Michael Eisen while at Stanford University. Cluster 3.0 was built for the Microsoft Windows platform, and later ported to Mac OS X (Cocoa build for Mac OS X v10.0 or later) and to Linux/Unix using Motif. In addition to the GUI program, Cluster 3.0 can also be run as a command line program. For more information, please consult the online manual.
Installation:
For Microsoft Windows and Mac OS X, use the appropriate installer. The Cluster 3.0 executables cluster.com (on Windows) or Cluster (on Mac OS X) can be used both as a GUI program and as a command line program.
For Cluster 3.0 on Linux/Unix, you will need the Motif libraries, which are already installed on many Linux/Unix computers. You will need a version compliant with Motif 2.1, such as OpenMotif. Cluster 3.0 can then be installed by typing
./configure
make
make install
The resulting executable cluster can be run as a GUI program and as a command line program. For the latter, you will need to use the appropriate command line options. If you are not interested in the GUI, and you want to run Cluster 3.0 as a command line program only, you can install a command-line only version of Cluster by typing
./configure --without-x
make
make install
If you install Cluster 3.0 as a command-line only program you do not need the Motif libraries.
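As a rough sketch of what a command-line run might look like, the Python snippet below assembles an argument list for the cluster executable and runs it only if both the executable and the data file are present. The option letters used here (-f for the input file, -g for the gene distance measure, -m for the hierarchical clustering method) and the file name expression.txt are assumptions for illustration; please check them against the online manual before relying on them. The helper build_cluster_command is hypothetical and not part of Cluster 3.0.

```python
import os
import shutil
import subprocess


def build_cluster_command(data_file, gene_distance=2, method="a",
                          executable="cluster"):
    """Return an argument list for a hierarchical clustering run on genes.

    The option letters are assumptions; verify them in the Cluster 3.0 manual.
    """
    return [executable, "-f", data_file, "-g", str(gene_distance), "-m", method]


if __name__ == "__main__":
    # "expression.txt" is a placeholder name for a tab-delimited data file.
    command = build_cluster_command("expression.txt")
    print(" ".join(command))
    if shutil.which(command[0]) is None:
        print("cluster executable not found on PATH; nothing was run")
    elif not os.path.exists(command[2]):
        print("input file not found; nothing was run")
    else:
        subprocess.run(command, check=True)
```

Building the argument list separately from running it makes it easy to inspect the exact command before anything is executed.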
Download (last update August 30, 2019; C Clustering Library version 1.59):
Installer for Microsoft Windows;
Installer for Mac OS X. You may need to remove /Library/Receipts/Cluster.pkg if you have an older version of Cluster 3.0 installed. If you get the error message '"Cluster.pkg" can't be opened because it is from an unidentified developer.', right-click on Cluster.pkg after downloading, and select "Open With" → "Installer".
Linux/Unix source code;
manual in PDF format.

Java TreeView
To view the clustering results generated by Cluster 3.0, we recommend using Alok Saldanha's Java TreeView, which can display hierarchical as well as k-means clustering results. Java TreeView is not part of the Open Source Clustering Software.

 

Python is a scripting language with excellent support for numerical work through the Numerical Python package, which provides functionality similar to Matlab and R. This makes Python together with Numerical Python an ideal tool for analyzing genome-wide expression data. Pycluster now uses the "new" Numerical Python (version 1.3 or later).
Python can be easily integrated with C and other low-level languages, thus combining the speed of C with the flexibility of Python.
The routines available in Pycluster are described in the manual to the C Clustering Library. To install Pycluster, download the Pycluster source distribution, unpack it, change to the directory Pycluster-1.59, and type python setup.py install as usual. If you use Python under Windows, we recommend using the Windows installer instead, which is available here for Python 2.7, 3.5, 3.6, and 3.7.
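Once Pycluster is installed, a k-means run from Python might look like the minimal sketch below. It assumes the kcluster routine as described in the manual to the C Clustering Library, which returns the cluster assignment of each row, the within-cluster sum of distances, and the number of times the best solution was found. The toy data and the wrapper kmeans_rows are hypothetical and exist only for illustration.

```python
# Hypothetical toy data: 6 genes (rows) measured in 3 experiments (columns).
EXAMPLE_DATA = [
    [1.0, 1.1, 0.9],
    [1.2, 0.9, 1.0],
    [0.8, 1.0, 1.1],
    [5.0, 5.2, 4.9],
    [5.1, 4.8, 5.0],
    [4.9, 5.0, 5.3],
]


def kmeans_rows(data, nclusters=2, npass=10):
    """Cluster the rows of `data` with Pycluster's k-means routine."""
    # Imported lazily so that this file can be loaded without Pycluster.
    import numpy
    from Pycluster import kcluster

    matrix = numpy.array(data, dtype=float)
    clusterid, error, nfound = kcluster(matrix, nclusters=nclusters, npass=npass)
    return list(clusterid), error, nfound


if __name__ == "__main__":
    try:
        labels, error, nfound = kmeans_rows(EXAMPLE_DATA)
    except ImportError as exc:
        print("Pycluster or Numerical Python is not installed:", exc)
    else:
        print("cluster assignment per row:", labels)
        print("within-cluster sum of distances:", error)
        print("times the best solution was found:", nfound)
```

Because k-means starts from a random initial assignment, the cluster numbers themselves can differ between runs; npass repeats the algorithm and keeps the best solution found.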
Download:
Pycluster source distribution;
Windows installer for Python 2.7 (32 bits);
Windows installer for Python 2.7 (64 bits);
Windows installer for Python 3.5 (32 bits);
Windows installer for Python 3.5 (64 bits);
Windows installer for Python 3.6 (32 bits);
Windows installer for Python 3.6 (64 bits);
Windows installer for Python 3.7 (32 bits);
Windows installer for Python 3.7 (64 bits);
manual in PDF format.

 

Algorithm::Cluster, written by John Nolan of the University of California, Santa Cruz, is a Perl module that makes use of the C Clustering Library. Some example Perl scripts are available in the perl/examples directory in the source distribution. To install Algorithm::Cluster, download the source code, unpack, and type perl Makefile.PL, followed by make to compile the code, make test to run the test scripts, and make install to install the Algorithm::Cluster module.
On Windows, we recommend using Strawberry Perl, which includes a compiler, allowing you to compile and install Algorithm::Cluster on Windows.
Download: Algorithm::Cluster source distribution; manual in PDF format.

 

The routines in the C Clustering Library can be included in or linked to other C programs (this is how we built Cluster 3.0). To use the C Clustering Library, simply collect the relevant source files from the source code distribution. As of version 1.04, the C Clustering Library complies with the ANSI C standard.


Download: source code; manual in PDF format.

 

License

The C Clustering Library and Pycluster were released under the Python License. Algorithm::Cluster was released under the Artistic License. The GUI versions of Cluster 3.0 for Windows, Mac OS X, and Linux/Unix, as well as the command line version of Cluster 3.0, are still covered by the original Cluster/TreeView license.

 

Acknowledgment

We would like to thank Michael Eisen of Berkeley Lab for making the source code of Cluster/TreeView 2.11 available. Without this source code, it would have been much harder to develop Cluster 3.0.
