Random Notes for Vcsn Developers
Coding Style
Do by imitation: see how things are written elsewhere, and do the same.
When the order does not matter, use the alphabetical order.
Headers are included in groups separated by an empty line, sorted alphabetically within each group. The groups come in this order:
C headers
C++ headers
Boost headers
Vcsn headers
If a function builds its return value in a variable, name it res.
Naming conventions
The short names below apply to local variables. Use meaningful (not hyper-short) names for long-lived entities (e.g., function names, members, etc.).
The printing functions take the printee first, then the stream.
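The conventions above (header groups, the res name, and printee-first printing) can be illustrated with a minimal sketch. The functions join and print_words are hypothetical helpers written for this illustration; they are not part of Vcsn, and treating the C++ wrappers of C headers as the "C headers" group is an assumption.

```cpp
// C headers (here, the C++ wrapper of a C header).
#include <cstddef>

// C++ headers, sorted alphabetically within the group.
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Boost headers, then Vcsn headers, would follow here, each in its
// own alphabetically sorted group.

// Hypothetical helper (not part of Vcsn): join words with a separator.
// The return value is built in a variable, so that variable is named `res`.
std::string join(const std::vector<std::string>& words,
                 const std::string& sep)
{
  std::string res;
  for (std::size_t i = 0; i < words.size(); ++i)
    {
      if (i)
        res += sep;
      res += words[i];
    }
  return res;
}

// Hypothetical printing function: the printee comes first, then the stream.
std::ostream& print_words(const std::vector<std::string>& words,
                          std::ostream& out)
{
  return out << join(words, ", ");
}
```

Returning the stream from the printing function keeps it chainable, which is a common idiom but not something these notes mandate.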
Tools you should know
c++filt: demangles C++ symbol names.
valgrind: verifies memory issues.
nm: lists the symbols in object files. Use nm -C to demangle C++ names.
ldd: lists the libraries an object file depends upon.
Address Sanitizer
It is possible to use ASAN, but there are quite a few tricky issues.
Be sure to pass -fsanitize=address in CXX, not CXXFLAGS, because otherwise libtool tends to move it.
ASAN must be loaded from the top-level executable. This is fine for Tools (so vcsn standard -Ee a, for instance, properly loads ASAN), but it is not for Python, since it is unlikely that your Python was linked against ASAN. So you need to preload it, e.g., LD_PRELOAD=/usr/lib/clang/3.8.0/lib/linux/libclang_rt.asan-x86_64.so.
Python and Boost.Python seem to leak. I have tried to use the LSAN (leak sanitizer) suppression files, but with limited success. For a start, I used ASAN_OPTIONS=detect_leaks=0 to disable LSAN.
Environment variables
VCSN [vcsn]
The path to the vcsn binary.
VCSN_COMPILE [$VCSN compile]
The name/path of the vcsn compile command.
VCSN_DEBUG
Don't remove temporary files (which is especially useful to keep debug symbols in plugins).
Augment output with debugging information.
dot: display in parens the real state numbers.
is_ambiguous: display the pair of states which is outside the diagonal.
proper: read VCSN_DEBUG as an integer specifying the level of detail to dump.
VCSN_DYN
Display information about the registration and querying of dyn algorithms.
VCSN_FORCE
The vcsn compile tool avoids useless recompilations. Setting VCSN_FORCE will make sure all the compilations are done, even the useless ones.
VCSN_HOME [~/.vcsn]
Where data is stored at runtime. See VCSN_PLUGINDIR.
VCSN_ITERATIVE
Specify that "power" should use the naive approach of iterated multiplication instead of repeated squaring.
VCSN_PARENS
Force the display of useless parentheses.
VCSN_PATH
The $PATH to use.
VCSN_PLUGINDIR [$VCSN_HOME/plugins]
Where the runtime context instantiations are generated and compiled.
VCSN_PRINT
Force a more verbose display of expressions (XML-like), where one can actually see the nesting of the structures.
VCSN_PYTHONDIR
The Python directory.
VCSN_SEED
Disable the generation of a random seed and stick to the compile-time default seed. This ensures that successive runs are identical.
VCSN_TMPDIR [/tmp]
Path to the folder in which compiled contexts should be stored.
VCSN_VERBOSE
When reporting compilation errors, include the full compiler error message. If defined as an integer greater than or equal to 2, then dyn compilations are reported.
VERBOSE
Make the test suite more verbose.
YYDEBUG, YYSCAN
Set to enable Bison parser/Flex scanner tracing. Can be an integer to denote nesting, which is useful, for instance, for dot parsing, which can trigger expression parsing: specify how many layers you want to make verbose.
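The VCSN_ITERATIVE entry above contrasts two ways of computing a power. The sketch below shows both on plain integers, which stand in for the automata or expressions Vcsn actually multiplies. The function names are hypothetical, and enabling the iterative approach whenever the variable is merely set is an assumption; the notes do not say how Vcsn reads it.

```cpp
#include <cstdlib>

// Hypothetical helper (not Vcsn's actual code): true if VCSN_ITERATIVE
// is set in the environment, whatever its value.
bool use_iterative()
{
  return std::getenv("VCSN_ITERATIVE") != nullptr;
}

// Naive approach: n successive multiplications, starting from 1.
unsigned long power_iterative(unsigned long base, unsigned n)
{
  unsigned long res = 1;
  for (unsigned i = 0; i < n; ++i)
    res *= base;
  return res;
}

// Repeated squaring: a number of multiplications logarithmic in n.
unsigned long power_squaring(unsigned long base, unsigned n)
{
  unsigned long res = 1;
  while (n)
    {
      // Multiply in the current power of base when the low bit of n is set.
      if (n % 2)
        res *= base;
      base *= base;
      n /= 2;
    }
  return res;
}

// Dispatch on the environment, in the spirit of VCSN_ITERATIVE.
unsigned long power(unsigned long base, unsigned n)
{
  return use_iterative()
    ? power_iterative(base, n)
    : power_squaring(base, n);
}
```

Both functions return the same value; only the number of multiplications differs, which is what makes such a switch useful for comparing the two strategies.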
Tests
To run the test suite with 4 processes in parallel (e.g., if you have 4 threads, as with an i7), run
To run some of the tests (e.g., tools/determinize.chk), run
or
The latter will create both tests/test-suite.log and tests/python/determinize.log, while the former will create only tests/python/determinize.log.
Both will run make all, which can take a while. If you know what you are doing (for instance, you want to check something which is compiled at runtime), you can avoid this make all by running the test by hand:
You may also run only the unit/rat/demo/tools tests:
To create a new test case
Creating a new test case is quite easy: select the proper file (usually the name of the tests/python/*.py file is clear enough, but check with git grep ALGO tests/python in case ALGO is actually tested in a less obvious place), then do as you see with the other tests.
To create a new test file
Avoid creating too many files. For instance, prefix and suffix do not deserve two different files; they can both live in prefix.py.
create the file tests/python/foo.py by copying some existing file
make sure it is executable
$ chmod +x tests/python/foo.py
add this file to the repository
$ git add tests/python/foo.py
add this file to the list of tests in tests/python/local.mk, respecting the alphabetical order
run just this test to make sure that everything is ok
$ ./_build/35s/tests/bin/vcsn -e tests/python/foo.py
To debug
The script _build/35s/tests/bin/vcsn defines the environment needed to run a non-installed version of Vcsn. To debug a Python program, use this:
for instance. To run a Tools command under Valgrind:
Note that the debug symbols (at least on OS X) are in the *.o file, not the *.so file, and the *.o is removed when the compilation succeeds. To keep it, be sure to enable VCSN_DEBUG:
Also, if your build is optimized, some variables might be optimized away. So use VCSN_CXXFLAGS to change that:
Benchmarks
Be sure to compile with -O3 -DNDEBUG for benchmarks. Be sure that the tools you compare against (typically OpenFST) are also compiled this way!
OpenFST
When benching against OpenFST, do not use the convenient automaton.fstFOO functions. For instance:
is completely unfair to OpenFST. Do this instead:
as it converts to FST format only once (and outside the benchmark command), and does not try to convert the result back to Vcsn. Actually, it does not even try to save the result at all (/dev/null), since Vcsn does not try to save its result either.
IPython
See also the page Widgets to know more about the development of IPython Widgets.
Commits
There are a number of rules to follow:
one purpose per commit
space changes are one purpose, therefore a separate commit
commits should have a title which looks like "topic: what" where
topic is something like doc, dyn, determinize, expression, style (for space changes, etc.)...
what is a sentence, starting with a lower case letter, telling what this change is about
commits then follow with text explaining the motivations for the commit
describe the bug it fixes
explain the alternatives and why this one was preferred
show results of vcsn score that demonstrate that there is no regression/an improvement
commits then follow with the list of files and a concise description of the changes (with an emphasis on the why rather than paraphrasing the git log).
Be sure to read other commit logs to understand the style and to copy it. Here are a few examples: