
Commit 6d72b35

Author: Daniel Mapleson
Finish Release-2.2.0
- Improved README and documentation
- Better checking of python dependencies
- Enforcing static linking of kat_jellyfish and kat libraries to executable
- Better checking of sequence files, now kat can detect fasta and fastq files for kmer counting, even without a known extension.
2 parents afc287a + 774feeb commit 6d72b35
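
The last item in the commit message describes detecting sequence files by content rather than by extension. The KAT source for that change is not among the hunks shown on this page, so the following is only a minimal Python sketch of the general idea. It assumes the usual convention that FASTA records begin with `>` and FASTQ records begin with `@`; the function name `sniff_sequence_format` is hypothetical and is not part of KAT.

```python
import io


def sniff_sequence_format(stream):
    """Guess the sequence format from the first non-blank line of a text stream.

    Returns "fasta", "fastq", or None when the content matches neither.
    This is an illustrative sketch, not KAT's implementation.
    """
    for line in stream:
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"
        if stripped.startswith("@"):
            return "fastq"
        return None  # first real line is neither a FASTA nor a FASTQ header
    return None  # empty input


if __name__ == "__main__":
    # The stream has no filename, so no extension is consulted at all.
    print(sniff_sequence_format(io.StringIO(">seq1\nACGT\n")))  # fasta
```

A check like this only inspects the first record header, so it is cheap, but it cannot validate that the rest of the file is well formed.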

22 files changed (+344 -174 lines changed)

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -39,3 +39,4 @@ Makefile.in
3939
*.Makefile.am.swp
4040
*.pc
4141
*.la
42+
*.out

README.md

Lines changed: 34 additions & 25 deletions
@@ -26,19 +26,25 @@ extensive documentation please visit: https://kat.readthedocs.org/en/latest/
2626

2727
##Installation:
2828

29-
Generic installation description can be found in the INSTALL file. There are two ways to install KAT from source, either by cloning the git repository, or by downloading a distributable package, the later method is generally recommended as it reduces the number of installation steps and dependencies required to be on your system.
30-
31-
Installing from distributable:
32-
- Confirm dependencies are installed and configured:
33-
- GCC V4.8+
34-
- make
35-
- libtool V2.4.2+
36-
- Boost (system,filesystem,program_options,chrono,timer) V1.53+
37-
- Plotting engine:
38-
- Option 1 (preferred) python3, with matplotlib. We recommend installing anaconda3 as this has all the required packages preinstalled.
39-
- Option 2 gnuplot
40-
- Optional - Sphinx documentation V1.3+ (comes with anaconda3)
41-
- Download tarball from https://github.com/TGAC/KAT/releases
29+
There are two ways to install KAT from source, either by cloning the git repository, or by downloading a distributable package, the later method is generally recommended as it reduces the number of installation steps and dependencies required to be on your system.
30+
31+
When installing from distributable first confirm dependencies are installed and configured:
32+
33+
- **GCC** V4.8+
34+
- **make**
35+
- **libtool** V2.4.2+
36+
- **pthreads** (probably already installed)
37+
- **Boost** (*system*,*filesystem*,*program_options*,*chrono*,*timer*) V1.53+. KAT will statically link boost libraries when possible so please make sure you have boost static libraries built on your system.
38+
- **Sphinx-doc** V1.3+ (Optional: only required for building the documentation. Sphinx comes with anaconda3, however if not using anaconda3 then install according to the instructions on the sphinx website: [http://www.sphinx-doc.org/en/stable/instructions](http://www.sphinx-doc.org/en/stable/instructions)))
39+
40+
In addition, KAT can only produce plots if one of the following plotting engines is installed:
41+
42+
- Option 1 (preferred): **python3, with matplotlib**. We recommend installing anaconda3 as this has all the required packages pre-installed, otherwise we need a python3 installation with development libraries and the *scipy* and *numpy* packages installed.
43+
- Option 2: **gnuplot**. This will produce basic plots but will not be as rich and detailed as with python3.
44+
45+
Then proceed with the following steps:
46+
47+
- Download the latest tarball from here: [https://github.com/TGAC/KAT/releases](https://github.com/TGAC/KAT/releases). NOTE: Please make sure not to download the github source code zips. The KAT distributables have the following filename format ```kat-<version>.tar.gz```.
4248
- Decompress and untar: ```tar -xvf kat-<version>.tar.gz```
4349
- Change into directory: ```cd kat-x.x.x```
4450
- Generate makefiles and confirm dependencies: ```./configure```
@@ -47,22 +53,24 @@ Installing from distributable:
4753
- Install: ```sudo make install```
4854

4955

50-
Installing from cloned repository
51-
- Clone the git repository (For ssh: ```git clone git@github.com:TGAC/KAT.git```; or for https: ```git clone https://github.com/TGAC/KAT.git```), into a directory on your machine.
52-
- "cd" into root directory of the installation
56+
Should you wish to install from a cloned git repository instead, do the following:
57+
5358
- Ensure these tools are correctly installed and available on your system:
54-
- autoconf V2.53+
55-
- automake V1.11+
59+
- **autoconf** V2.53+
60+
- **automake** V1.11+
61+
- Clone the git repository (For ssh: ```git clone git@github.com:TGAC/KAT.git```; or for https: ```git clone https://github.com/TGAC/KAT.git```), into a directory on your machine.
62+
- "cd" into root directory of the installation
5663
- Create configuration script by typing: ```./autogen.sh```.
57-
- Follow all steps described in "Installing from a distributable" (except for the download and decompress tarball steps).```
64+
- Follow all steps described in "Installing from a distributable" (except for the download and decompress tarball steps).
5865

5966
The configure script can take several options as arguments. One commonly modified option is ```--prefix```, which will install KAT to a custom directory. By default this is "/usr/local", so the KAT executable would be found at "/usr/local/bin" by default. In addition, some options specific to managing KAT dependencies located in non-standard locations are:
6067

6168
- ```--with-boost``` - for specifying a custom boost directory
6269

6370
Type ```./configure --help``` for full details.
6471

65-
KAT can also make plots. To enable plotting functionality we require either python3,
72+
As already mentioned KAT can also make plots but requires external software to be
73+
available to do this. To enable plotting functionality we require either python3,
6674
with numpy, scipy and matplotlib packages installed. The python installation
6775
must come with the python shared library, on debian systems you can install this
6876
with "sudo apt-get install python3-dev". If you don't already have python3 installed on your system
@@ -71,8 +79,8 @@ you can use gnuplot, although the python plotting method is the preferred method
7179
produce nicer results.
7280

7381
The type of plotting engine used will be determined when running the configure
74-
script, which will select the first engine detected in the following order: python, gnuplot, none.
75-
There is currently no way to select the plotting directory from a custom location,
82+
script, which will select the first engine detected in the following order: python,
83+
gnuplot, none. There is currently no way to select the plotting directory from a custom location,
7684
so the plotting system needs to be properly installed and configured on your
7785
system: i.e. python3 or gnuplot must be available on the PATH.
7886

@@ -106,10 +114,11 @@ GNU GPL V3. See COPYING file for more details.
106114

107115
##Cite:
108116

109-
The KAT paper is currently in submission. In the meantime, if you use our software
110-
and wish to cite us please use our bioRxiv preprint:
117+
If you use KAT in your work and wish to cite us please use the following citation:
111118

112-
Daniel Mapleson et al. 2016. KAT: A K-mer Analysis Toolkit to quality control NGS datasets and genome assemblies. bioRxiv doi: 10.1101/064733
119+
Daniel Mapleson, Gonzalo Garcia Accinelli, George Kettleborough, Jonathan Wright, and Bernardo J. Clavijo.
120+
**KAT: A K-mer Analysis Toolkit to quality control NGS datasets and genome assemblies.**
121+
Bioinformatics, 2016. doi: 10.1093/bioinformatics/btw663
113122

114123

115124
##Authors:

configure.ac

Lines changed: 9 additions & 8 deletions
@@ -4,7 +4,7 @@
44

55
# Autoconf initialistion. Sets package name version and contact details
66
AC_PREREQ([2.68])
7-
AC_INIT([kat],[2.1.1],[https://github.com/TGAC/KAT/issues],[kat],[https://github.com/TGAC/KAT])
7+
AC_INIT([kat],[2.2.0],[https://github.com/TGAC/KAT/issues],[kat],[https://github.com/TGAC/KAT])
88
AC_CONFIG_SRCDIR([src/kat.cc])
99
AC_CONFIG_AUX_DIR([build-aux])
1010
AC_CONFIG_MACRO_DIR([m4])
@@ -80,28 +80,29 @@ fi
8080
# Plotting
8181
AC_ARG_ENABLE([gnuplot], AS_HELP_STRING([--enable-gnuplot], [Enable gnuplot plotting even if python matplotlib is available]), enable_gnuplot="yes", enable_gnuplot="no")
8282

83-
AM_PATH_PYTHON([], [], AC_MSG_ERROR([Python interpreter not found.]))
83+
#AM_PATH_PYTHON([], [], AC_MSG_ERROR([Python interpreter not found.]))
8484

8585
AX_PYTHON_DEVEL([>= '3.1'])
8686

87+
pybin=python${PYTHON_VERSION}
8788
pymod_good="no"
8889
if [[ -n "${PYTHON_VERSION}" ]]; then
8990
if [[ -z "${PYTHON_EXTRA_LIBS}" ]]; then
9091
pymod_good="no"
9192
AC_MSG_WARN([Python3 detected but Python3 development library was not found. If you wish to use python plotting please install python3 library. e.g. "sudo apt-get install python3-dev" on debian systems.])
9293
fi
9394
pymod_good="yes"
94-
AX_PYTHON_MODULE(numpy, ,python3)
95+
AX_PYTHON_MODULE(numpy, ,${pybin})
9596
if [[ "${PYMOD}" == "no" ]]; then
9697
pymod_good="no"
9798
fi
98-
AX_PYTHON_MODULE(matplotlib, ,python3)
99+
AX_PYTHON_MODULE(matplotlib, ,${pybin})
99100
if [[ "${PYMOD}" == "no" ]]; then
100101
pymod_good="no"
101102
fi
102-
AX_PYTHON_MODULE(scipy, ,python3)
103+
AX_PYTHON_MODULE(scipy, ,${pybin})
103104
if [[ "${PYMOD}" == "no" ]]; then
104-
pymod_good="no"
105+
pymod_good="no"
105106
fi
106107
fi
107108

@@ -149,7 +150,7 @@ AX_BOOST_CHRONO
149150
AX_BOOST_TIMER
150151

151152

152-
define([PC_FILE], lib/kat-2.1.pc)
153+
define([PC_FILE], lib/kat-2.2.pc)
153154

154155

155156
# Combine BOOST variables (apart for BOOST_TEST)
@@ -176,7 +177,7 @@ AC_SUBST([AM_LIBS])
176177

177178

178179
AC_CONFIG_HEADERS([config.h])
179-
AC_CONFIG_FILES([Makefile doc/Makefile doc/source/conf.py lib/kat-2.1.pc lib/Makefile src/Makefile tests/Makefile tests/compat.sh deps/seqan-library-2.0.0/Makefile])
180+
AC_CONFIG_FILES([Makefile doc/Makefile doc/source/conf.py lib/kat.pc lib/Makefile src/Makefile tests/Makefile tests/compat.sh deps/seqan-library-2.0.0/Makefile])
180181
AC_CONFIG_SUBDIRS([deps/jellyfish-2.2.0])
181182
AC_OUTPUT
182183
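
The plotting hunk in this file runs `AX_PYTHON_MODULE` for numpy, matplotlib and scipy against the detected interpreter (`python${PYTHON_VERSION}` after this change, instead of a hard-coded `python3`) and sets `pymod_good` to "no" if any of them is missing. A rough Python equivalent of that check is sketched below. It is an illustration only: the names `REQUIRED_MODULES`, `missing_modules` and `python_plotting_ok` are hypothetical, and the sketch tests the interpreter that runs it rather than one chosen by configure.

```python
import importlib.util

# Modules the configure script probes for before enabling python plotting.
REQUIRED_MODULES = ("numpy", "matplotlib", "scipy")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_plotting_ok(names=REQUIRED_MODULES):
    """True only when every required module is importable (pymod_good == "yes")."""
    return not missing_modules(names)


if __name__ == "__main__":
    absent = missing_modules()
    if absent:
        print("python plotting unavailable, missing:", ", ".join(absent))
    else:
        print("python plotting available")
```

Running the sketch with the same interpreter that configure reports is a quick way to see which package a failed check is complaining about.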

deps/jellyfish-2.2.0/Makefile.am

Lines changed: 5 additions & 5 deletions
@@ -5,7 +5,7 @@ EXTRA_DIST = doc/kat_jellyfish.pdf doc/kat_jellyfish.man README LICENSE # jellyf
55
man1_MANS = doc/kat_jellyfish.man
66

77
pkgconfigdir = $(libdir)/pkgconfig
8-
pkgconfig_DATA = kat_jellyfish-2.0.pc
8+
pkgconfig_DATA = kat_jellyfish.pc
99

1010
AM_LDFLAGS = -lpthread # $(VALGRIND_LIBS)
1111
AM_CPPFLAGS = -Wall -Wnon-virtual-dtor -Wno-deprecated-declarations -I$(top_srcdir) -I$(top_srcdir)/include -g -O3 $(VALGRIND_CFLAGS)
@@ -30,8 +30,8 @@ YAGGO_SOURCES = # Append all file to be built by yaggo
3030

3131
# What to build
3232
bin_PROGRAMS += bin/kat_jellyfish
33-
lib_LTLIBRARIES = libkat_jellyfish-2.0.la
34-
LDADD = libkat_jellyfish-2.0.la # $(VALGRIND_LIBS)
33+
lib_LTLIBRARIES = libkat_jellyfish.la
34+
LDADD = libkat_jellyfish.la # $(VALGRIND_LIBS)
3535
check_PROGRAMS = bin/generate_sequence
3636

3737
############################
@@ -66,8 +66,8 @@ YAGGO_SOURCES += sub_commands/count_main_cmdline.hpp \
6666
######################################
6767
# Build Jellyfish the shared library #
6868
######################################
69-
libkat_jellyfish_2_0_la_LDFLAGS = -version-info 2:0:0
70-
libkat_jellyfish_2_0_la_SOURCES = lib/rectangular_binary_matrix.cc \
69+
libkat_jellyfish_la_LDFLAGS = -version-info 2:0:0
70+
libkat_jellyfish_la_SOURCES = lib/rectangular_binary_matrix.cc \
7171
lib/mer_dna.cc lib/storage.cc \
7272
lib/allocators_mmap.cc lib/misc.cc \
7373
lib/int128.cc lib/thread_exec.cc \

deps/jellyfish-2.2.0/configure.ac

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ AC_ARG_VAR([YAGGO], [Yaggo switch parser generator])
2727
AS_IF([test "x$YAGGO" = "x"], [AC_PATH_PROG([YAGGO], [yaggo], [false])])
2828

2929
dnl define([concat], $1$2$3)dnl
30-
define([PC_FILE], kat_jellyfish-2.0.pc)
30+
define([PC_FILE], kat_jellyfish.pc)
3131
AC_CONFIG_FILES([
3232
Makefile
3333
tests/compat.sh

deps/jellyfish-2.2.0/kat_jellyfish-2.0.pc.in renamed to deps/jellyfish-2.2.0/kat_jellyfish.pc.in

Lines changed: 2 additions & 2 deletions
@@ -3,8 +3,8 @@ exec_prefix=@exec_prefix@
33
libdir=@libdir@
44
includedir=@includedir@
55

6-
Name: Jellyfish
6+
Name: kat_jellyfish
77
Description: A multi-threaded hash based k-mer counter.
88
Version: @PACKAGE_VERSION@
9-
Libs: -L${libdir} -lkat_jellyfish-2.0 -lpthread
9+
Libs: -L${libdir} -lkat_jellyfish -lpthread
1010
Cflags: -I${includedir}/jellyfish-@PACKAGE_VERSION@

doc/source/conf.py

Lines changed: 6 additions & 5 deletions
@@ -45,16 +45,16 @@
4545

4646
# General information about the project.
4747
project = u'kat'
48-
copyright = u'2015, Daniel Mapleson, Bernardo Clavijo, George Kettleborough, Gonzalo Garcia, Jon Wright'
48+
copyright = u'2016, Daniel Mapleson, Bernardo Clavijo, George Kettleborough, Gonzalo Garcia, Jon Wright'
4949

5050
# The version info for the project you're documenting, acts as replacement for
5151
# |version| and |release|, also used in various other places throughout the
5252
# built documents.
5353
#
5454
# The short X.Y version.
55-
version = '2.1.1'
55+
version = '2.2.0'
5656
# The full version, including alpha/beta/rc tags.
57-
release = '2.1.1'
57+
release = '2.2.0'
5858

5959
# The language for content autogenerated by Sphinx. Refer to documentation
6060
# for a list of supported languages.
@@ -99,7 +99,7 @@
9999

100100
# The theme to use for HTML and HTML Help pages. See the documentation for
101101
# a list of builtin themes.
102-
html_theme = 'alabaster'
102+
html_theme = 'nature'
103103

104104
# Theme options are theme-specific and customize the look and feel of a theme
105105
# further. For a list of options available for each theme, see the
@@ -144,7 +144,8 @@
144144
#html_use_smartypants = True
145145

146146
# Custom sidebar templates, maps document names to template names.
147-
#html_sidebars = {}
147+
html_sidebars = { '**': ['globaltoc.html', 'relations.html', 'sourcelink.html', 'searchbox.html'], }
148+
148149

149150
# Additional templates that should be rendered to pages, maps page names to
150151
# template names.

doc/source/conf.py.in

Lines changed: 4 additions & 3 deletions
@@ -45,7 +45,7 @@ master_doc = 'index'
4545

4646
# General information about the project.
4747
project = u'@PACKAGE_TARNAME@'
48-
copyright = u'2015, Daniel Mapleson, Bernardo Clavijo, George Kettleborough, Gonzalo Garcia, Jon Wright'
48+
copyright = u'2016, Daniel Mapleson, Bernardo Clavijo, George Kettleborough, Gonzalo Garcia, Jon Wright'
4949

5050
# The version info for the project you're documenting, acts as replacement for
5151
# |version| and |release|, also used in various other places throughout the
@@ -99,7 +99,7 @@ pygments_style = 'sphinx'
9999

100100
# The theme to use for HTML and HTML Help pages. See the documentation for
101101
# a list of builtin themes.
102-
html_theme = 'alabaster'
102+
html_theme = 'nature'
103103

104104
# Theme options are theme-specific and customize the look and feel of a theme
105105
# further. For a list of options available for each theme, see the
@@ -144,7 +144,8 @@ html_static_path = ['_static']
144144
#html_use_smartypants = True
145145

146146
# Custom sidebar templates, maps document names to template names.
147-
#html_sidebars = {}
147+
html_sidebars = { '**': ['globaltoc.html', 'relations.html', 'sourcelink.html', 'searchbox.html'], }
148+
148149

149150
# Additional templates that should be rendered to pages, maps page names to
150151
# template names.

doc/source/index.rst

Lines changed: 2 additions & 2 deletions
@@ -46,7 +46,7 @@ Citing
4646
The KAT paper is currently in submission. In the meantime, if you use our software
4747
and wish to cite us please use our bioRxiv preprint:
4848

49-
Daniel Mapleson et al. 2016. KAT: A K-mer Analysis Toolkit to quality control NGS datasets and genome assemblies. bioRxiv doi: 10.1101/064733
49+
`Daniel Mapleson et al. 2016. KAT: A K-mer Analysis Toolkit to quality control NGS datasets and genome assemblies. bioRxiv doi: 10.1101/064733 <http://biorxiv.org/content/early/2016/07/19/064733>`_
5050

5151

5252

@@ -56,7 +56,7 @@ Issues
5656
======
5757

5858
Should you discover any issues with spectre, or wish to request a new feature please raise a `ticket here <https://github.com/TGAC/KAT/issues>`_.
59-
Alternatively, contact Daniel Mapleson at: [email protected]
59+
Alternatively, contact Daniel Mapleson at: [email protected]; or Bernardo Clavijo at: [email protected]
6060

6161

6262
.. _availability:

doc/source/using.rst

Lines changed: 6 additions & 6 deletions
@@ -188,7 +188,7 @@ Applications:
188188
* Basic K-mer spectra visualisation
189189

190190
.. image:: images/ccoli_hist.png
191-
:scale: 20%
191+
:scale: 33%
192192

193193
Density
194194
~~~~~~~
@@ -211,9 +211,9 @@ Applications:
211211

212212

213213
.. image:: images/ccoli_gcp.png
214-
:scale: 20%
214+
:scale: 25%
215215
.. image:: images/ccoli_comp.png
216-
:scale: 20%
216+
:scale: 25%
217217

218218

219219
Profile
@@ -230,7 +230,7 @@ Applications:
230230
* Visualise coverage (and optionally GC) levels across a sequence or set of sequences
231231

232232
.. image:: images/profile.png
233-
:scale: 30%
233+
:scale: 66%
234234

235235

236236
Spectra_CN
@@ -249,7 +249,7 @@ Applications:
249249
* Visualise the copy number spectra of WGS data compared against an assembly
250250

251251
.. image:: images/heterozygous_real.png
252-
:scale: 75%
252+
:scale: 33%
253253

254254

255255

@@ -286,5 +286,5 @@ Applications:
286286
* Visualising k-mer spectra of arbitrary columns and rows from a matrix
287287

288288
.. image:: images/pe_v_pe_1_shared.png
289-
:scale: 50%
289+
:scale: 33%
290290
