Thursday, 21 February 2008

Visualize dependencies of binaries and libraries on Linux

Update2: pfee made some more fixes. The script now parses the dependency tree correctly using readelf and ldd, so that only direct dependencies appear in the graph. The updated version can also be found at dependencies.sh



Update: Thanks to the feedback from pfee, I made some fixes to the script. The script is now also available for direct download as dependencies.sh



Sometimes it is useful to know the library dependencies of an application or a library on Linux (or Unix). Open source applications in particular depend on lots of libraries, which in turn depend on other libraries. So it is not always clear which dependencies your software has.



Imagine you want to package up your software for a customer and need to know on which libraries your software depends. Usually you know which libraries were used during development, but what are the dependencies of these libraries? You have to package all dependencies so that the customer can use and/or install your software.



I created a bash script which uses ldd to find the dependencies of a binary on Linux and Graphviz to create a dependency graph out of this information. Benedikt Hauptmann had the idea to show dependencies as a graph - so I cannot take credit for that. Using this script I created the dependency graph of TFORMer, the report generator we are developing at TEC-IT. The result is a nice graph showing all the dependencies a user has to have installed before using TFORMer.
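To give an idea of what ends up being handed to Graphviz, here is a minimal sketch (not the script itself) that turns "parent child" pairs into a DOT digraph of the same shape the script writes. The function name `edges_to_dot` and the sample edges are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read "parent child" pairs from stdin and
# print a Graphviz digraph with one edge per pair.
edges_to_dot() {
    local root=$1 parent child
    echo "digraph DependencyTree {"
    echo "  \"$root\" [shape=box];"
    while read -r parent child; do
        # Skip empty or incomplete lines
        [ -n "$parent" ] && [ -n "$child" ] || continue
        echo "  \"$parent\" -> \"$child\";"
    done
    echo "}"
}

# Example with made-up edges:
printf '%s\n' "myapp libfoo.so.1" "libfoo.so.1 libc.so.6" | edges_to_dot myapp
```

Piping the output into `dot -Tpng -o myapp.png` should then render the image, provided Graphviz is installed.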





Another beautiful graph is that of PoDoFo, shown below.





The dependencies of Firefox are way more complex than the examples shown above...





If you want to create a graph of your favorite application or library yourself, get the script from here. I published the simple source code below. Graphviz is the only requirement. Usage is very simple: just pass an application or library as the first parameter and the output image as the second. The script will always create a PNG image:


./dependencies.sh /usr/bin/emacs emacs.png
./dependencies.sh /usr/local/lib/libpodofo.so \
podofo.png



The code of the script is as follows. (Warning: the style sheet cuts off some lines, so it is better to download the script from dependencies.sh)




#!/bin/bash

# This is the maximum depth to which dependencies are resolved
MAXDEPTH=14

# Analyze a given file for its
# dependencies using readelf and ldd,
# and write the results to a given temporary file
#
# Usage: analyze [OUTPUTFILE] [INPUTFILE]
function analyze
{
local OUT=$1
local IN=$2
local NAME=$(basename $IN)

for i in $LIST
do
if [ "$i" == "$NAME" ];
then
# This file was already parsed
return
fi
done
# Put the file in the list of all files
LIST="$LIST $NAME"

DEPTH=$((DEPTH + 1))
if [ $DEPTH -ge $MAXDEPTH ];
then
echo "MAXDEPTH of $MAXDEPTH reached at file $IN."
echo "Continuing with next file..."
# Undo the increment before leaving this level
DEPTH=$((DEPTH - 1))
return
fi

echo "Parsing file: $IN"

$READELF "$IN" &> "$READELFTMPFILE"
ELFRET=$?

if [ $ELFRET != 0 ];
then
echo "ERROR: ELF reader returned error code $ELFRET"
echo "ERROR:"
# The ELF reader's messages were captured in its own temporary file
cat "$READELFTMPFILE"
echo "Aborting..."
rm -f "$TMPFILE" "$READELFTMPFILE" "$LDDTMPFILE"
exit 1
fi

DEPENDENCIES=$(cat $READELFTMPFILE | grep NEEDED | awk '{if (substr($NF,1,1) == "[") print substr($NF, 2, length($NF) - 2); else print $NF}')

for DEP in $DEPENDENCIES;
do
if [ -n "$DEP" ];
then

ldd "$IN" &> "$LDDTMPFILE"
LDDRET=$?

if [ $LDDRET != 0 ];
then
echo "ERROR: ldd returned error code $LDDRET"
echo "ERROR:"
# The ldd messages were captured in its own temporary file
cat "$LDDTMPFILE"
echo "Aborting..."
rm -f "$TMPFILE" "$READELFTMPFILE" "$LDDTMPFILE"
exit 1
fi

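# Match the name literally (-F) and keep only the first hit,
# so that analyze is never called with several paths at once
DEPPATH=$(grep -F -- "$DEP" "$LDDTMPFILE" | awk '{print $3}' | head -n 1)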
if [ -n "$DEPPATH" ];
then
echo -e " \"$NAME\" -> \"$DEP\";" >> $OUT
analyze $OUT $DEPPATH
fi
fi
done

DEPTH=$((DEPTH - 1))
}

########################################
# main #
########################################

if [ $# != 2 ];
then
echo "Usage:"
echo " $0 [filename] [outputimage]"
echo ""
echo "This tool analyzes a shared library or an executable"
echo "and generates a dependency graph as an image."
echo ""
echo "GraphViz must be installed for this tool to work."
echo ""
exit 1
fi

DEPTH=0
INPUT=$1
OUTPUT=$2
TMPFILE=$(mktemp -t)
LDDTMPFILE=$(mktemp -t)
READELFTMPFILE=$(mktemp -t)
LIST=""

if [ ! -e $INPUT ];
then
echo "ERROR: File not found: $INPUT"
echo "Aborting..."
exit 2
fi

# Use either readelf or dump
# Linux has readelf, Solaris has dump
READELF=$(type readelf 2> /dev/null)
if [ $? != 0 ]; then
READELF=$(type dump 2> /dev/null)
if [ $? != 0 ]; then
echo Unable to find ELF reader
exit 1
fi
READELF="dump -Lv"
else
READELF="readelf -d"
fi



echo "Analyzing dependencies of: $INPUT"
echo "Creating output as: $OUTPUT"
echo ""

echo "digraph DependencyTree {" > $TMPFILE
echo " \"$(basename $INPUT)\" [shape=box];" >> $TMPFILE
analyze $TMPFILE "$INPUT"
echo "}" >> $TMPFILE

#cat $TMPFILE # output generated dotfile for debugging purposes
dot -Tpng "$TMPFILE" -o"$OUTPUT"

# Remove all three temporary files
rm -f "$LDDTMPFILE" "$READELFTMPFILE" "$TMPFILE"

exit 0

18 comments:

Kleag said...

Note that if you want to view the graph in an easy to use application, you can use the KDE app kgraphviewer (from extragear). You can even integrate the viewer as a KPart in your app.

pfee said...

Nice idea, some bugs:

54: ldd returned error code $?"
Should be "error code $RET" since $? will have changed at this point. $RET correctly remembers ldd errors from line 51.

63: Don't call "analyze" on ldd results which don't refer to files.
Hence change the line to:
DEPENDENCIES=$(cat $LDDTMPFILE | awk -F " " '{ if (!match($3, /\(.*\)/)) print $3; }')

The first time I tried the script, I used an executable that had ldd output with these lines:

linux-gate.so.1 => (0xffffe000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d9c000)
/lib/ld-linux.so.2 (0xb7f66000)

The analyze function should only be passed valid filenames.

Dominik Seichter said...

Thanks for your comments!
I uploaded a fixed script. Just compiling kgraphviewer - looks very interesting! Thanks for the tip.

Ciao,
Dom

Dave Foster said...

Hi - cool script. However, every binary I ran it on errored, the ending always looked like this:

"libobparser" -> "libxml2";
"libxml2" -> "libdl";
"libxml2" -> "libz";
"libz" -> "you";

The "you" looks very problematic. Any idea where it's coming from?

Dominik Seichter said...

The reason seems to be ldd output like this, I guess:
% ldd /lib/libgcc_s.so.1
ldd: warning: you do not have execution permission for `/lib/libgcc_s.so.1'
linux-vdso.so.1 => (0x00007fffdbffe000)
libc.so.6 => /lib/libc.so.6 (0x00002b28cedc6000)
/lib64/ld-linux-x86-64.so.2 (0x0000555555554000)


So the "you" comes from "you do not have execution permission". I am not sure where this comes from, as I can run ldd on every binary on my system. At least one other user has reported similar issues, but not with all executables. Does running as root help?

Dominik Seichter said...

I uploaded a new version of the script to http://krename.sf.net/data/scripts/dependencies.sh which should fix your problem. Thanks for reporting.

Dom

pfee said...

Some more improvements, this time to make it work on Solaris.

Solaris awk doesn't like a space between -F and the following delimiter. Also the "match" function I suggested yesterday is not available with Solaris awk, therefore here's another way to cope with ldd output of the form:

linux-gate.so.1 => (0xffffe000)

Instead of rejecting a regular expression to detect a starting character "(" and an ending character ")", it's probably enough just to reject strings starting with "(".

The changes:
Line 10: remove space between -F and "."

Line 60: remove space between -F and " ".
Line 60: change "!match($3, /\(.*\)/)" to "substr($3, 1, 1) != "("".

The complete line 60 is now:
DEPENDENCIES=$(cat $LDDTMPFILE | awk -F" " '{ if (substr($3, 1, 1) != "(" && substr( $0, 1, 13 ) != "ldd: warning:") print $3; }')

With these changes the script works on Linux and Solaris.
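As a rough illustration of the filter described in this comment, here is the same logic as a standalone function that reads ldd output on stdin. The name `ldd_paths` is made up, and skipping lines without a third field is an addition of this sketch, not part of the line quoted above.

```shell
#!/bin/bash
# Hypothetical helper: print the resolved path (third field) of each
# ldd line, skipping warnings and entries that have no file on disk.
ldd_paths() {
    awk '{
        if (substr($0, 1, 13) == "ldd: warning:") next   # warning line
        if ($3 == "") next                                # no third field
        if (substr($3, 1, 1) == "(") next                 # only an address, no path
        print $3
    }'
}

# Example using the ldd output quoted earlier in the comments:
ldd_paths <<'EOF'
ldd: warning: you do not have execution permission for `/lib/libgcc_s.so.1'
        linux-vdso.so.1 =>  (0x00007fffdbffe000)
        libc.so.6 => /lib/libc.so.6 (0x00002b28cedc6000)
        /lib64/ld-linux-x86-64.so.2 (0x0000555555554000)
EOF
```

For this sample only `/lib/libc.so.6` is printed, so the word "you" from the warning never reaches the graph.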

pfee said...

Another thought - this script gives the wrong results. The output of ldd has been misinterpreted.

ldd reports both immediate dependencies and recursively follows these to find subsequent dependencies.

This script also recursively looks for sub-dependencies. Therefore the resulting graph has too many links between libraries.

Instead a tool which can give you only immediate dependencies is needed, such as readelf.

Example output from ldd:
$ ldd /bin/ls
linux-gate.so.1 => (0xffffe000)
librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7ed7000)
libacl.so.1 => /lib/libacl.so.1 (0xb7ed0000)
libselinux.so.1 => /lib/libselinux.so.1 (0xb7eb9000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d6f000)
libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7d57000)
/lib/ld-linux.so.2 (0xb7efc000)
libattr.so.1 => /lib/libattr.so.1 (0xb7d53000)
libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7d4f000)
libsepol.so.1 => /lib/libsepol.so.1 (0xb7d0e000)

However, readelf shows that /bin/ls has far fewer direct dependencies:

readelf -d /bin/ls | grep NEEDED
0x00000001 (NEEDED) Shared library: [librt.so.1]
0x00000001 (NEEDED) Shared library: [libacl.so.1]
0x00000001 (NEEDED) Shared library: [libselinux.so.1]
0x00000001 (NEEDED) Shared library: [libc.so.6]

Therefore the graphviz PNG should only show four links starting at /bin/ls instead of the eight we get with the current script.

We'll need to use readelf to get the direct dependencies. Then we'll need to use ldd to find the path at which this dependency can be found. Then proceed with recursive analysis as before.

Now all that's left is to implement the change.
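The two steps could be sketched roughly as follows. Both function names (`needed_libs`, `resolve_lib`) are made up for this illustration, the sketch assumes an awk that supports `-v`, and it is not the code that ended up in the script.

```shell
#!/bin/bash
# Hypothetical helper: extract the library names from the NEEDED
# entries of "readelf -d" output read on stdin.
needed_libs() {
    awk '/NEEDED/ {
        name = $NF
        # Strip the surrounding [ ] that readelf prints
        if (substr(name, 1, 1) == "[") name = substr(name, 2, length(name) - 2)
        print name
    }'
}

# Hypothetical helper: look up one library name in ldd output
# (read on stdin) and print the path it resolves to, if any.
resolve_lib() {
    awk -v lib="$1" '$1 == lib && $2 == "=>" && substr($3, 1, 1) == "/" { print $3; exit }'
}

# Example (needs readelf and ldd, so it is left commented out):
# for lib in $(readelf -d /bin/ls | needed_libs); do
#     echo "$lib -> $(ldd /bin/ls | resolve_lib "$lib")"
# done
```

Each resolved path would then be fed back into the same two steps, which gives the recursion without the extra links.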

Dave Foster said...

You've fixed my problem, cheers. I'll still watch the comments, this is a neat utility.

pfee said...

Further fixes available at:
http://rafb.net/p/N7a07F26.nln.html

Please copy from here before the page expires.

I've removed -F " " from awk statements since this is the default delimiter already.

The graph now shows direct dependencies. This can remove a lot of incorrect links making the graphs much clearer and more accurate.

I've used "readelf" on Linux and "dump" on Solaris.

Dominik Seichter said...

pfee, thanks for your work. I uploaded your fixed version of the script. How could I miss that ldd gives the wrong results? Anyway, your new version seems to be correct now! Thanks for helping with this.

best regards,
Dom

pfee said...

Hi Dominik,

Now that the script gives more concise graphs, it would be interesting to rerun your examples.

Currently the application's node has a link to every library. Since it should only have a link to its direct dependencies, rerunning the script would make the diagrams clearer.

Here are some more improvement ideas.

Firstly, try to avoid the temporary files. Perhaps the output of the child processes (ldd and readelf) could be captured in variables. This removes the need to clean up the files on exit and seems generally better to me.
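A minimal sketch of this first idea, assuming a helper named `run_captured` (a made-up name) that stores a command's output and exit status in two global variables:

```shell
#!/bin/bash
# Hypothetical helper: run a command and keep its output and exit
# status in shell variables instead of a temporary file.
# Globals are used on purpose: "local var=$(...)" would hide the
# exit status of the command behind that of "local".
run_captured() {
    CAPTURED_OUTPUT=$("$@" 2>&1)
    CAPTURED_STATUS=$?
}

# "echo" stands in for a call such as: run_captured ldd "$IN"
run_captured echo "libc.so.6 => /lib/libc.so.6 (0xb7d9c000)"
if [ "$CAPTURED_STATUS" -ne 0 ]; then
    echo "ERROR: command returned error code $CAPTURED_STATUS"
    echo "$CAPTURED_OUTPUT"
else
    # Later processing reads from the variable, not from a file
    printf '%s\n' "$CAPTURED_OUTPUT" | awk '{print $3}'
fi
```

With the output held in variables there is nothing left to remove on exit, including on the error paths.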

Secondly, make the second parameter, the output PNG filename optional. If no filename is supplied, look around the system for a suitable image display program and check if the DISPLAY environment variable is set.

If these conditions are met, then bring the image up on screen immediately. This will speed up the user's experience as it's most likely that they'll view the image immediately after creating it.

If a viewer isn't available, then output an error message guiding the user to supply a second parameter which will then cause PNG output to disk as normal.
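The decision described here might look roughly like the sketch below. The function names are made up, and the viewer candidates (`display`, `eog`, `xdg-open`) are only assumptions about what may be installed.

```shell
#!/bin/bash
# Hypothetical helper: print the first program from the argument
# list that is found in PATH; fail if none is available.
find_viewer() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" > /dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: decide what to do with the optional second
# parameter. Prints "file" or "view", or fails with a hint.
choose_output_mode() {
    local output=$1
    if [ -n "$output" ]; then
        echo "file"
    elif [ -n "${DISPLAY:-}" ] && find_viewer display eog xdg-open > /dev/null; then
        echo "view"
    else
        echo "ERROR: no display or image viewer found." >&2
        echo "Please pass an output image as second parameter." >&2
        return 1
    fi
}

choose_output_mode "graph.png"
```

In "view" mode the script could write the PNG to a temporary location and start the program returned by `find_viewer` on it.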

With the ability to go straight from an ELF file (i.e. the application or shared library) to an on-screen image, it would then be nice to integrate this with KDE so that right-clicking on ELF files gives the option to bring up the dependency graph. I'm not too familiar with KDE filetype/application mappings, but I don't think this step would be too difficult.

Thanks,
Paul

stacy said...

This is my Good luck that I found your post which is according to my search and topic, I think you are a great blogger, thanks for helping me outta my problem..

heobeo said...

i love Linux >"<


josiem said...

Thanks for taking the time to discuss this, I feel strongly about it and love learning more on this topic. If possible, as you gain expertise, would you mind updating your blog with more information? It is extremely helpful for me.

uknowme said...

If a viewer isn't available, then output an error message guiding the user to supply a second parameter which will then cause PNG output to disk as normal

glenn tanner said...

Great tool! However, it looks like the program leaves out libraries that ldd lists by absolute path. Please let me know if I'm wrong in making this change, as I plan to further customize the script to copy programs into an initrd with all needed libs.

DEPPATH=$(grep $DEP $LDDTMPFILE | awk '{if ((NF==4) && ($3 ~ /^\//)) {print $3} else if ((NF==2) && ($1 ~ /^\//)) {print $1}}')

#from pfee's post
linux-gate.so.1 => (0xffffe000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d9c000)
/lib/ld-linux.so.2 (0xb7f66000) <--skipped
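To try the proposed expression on the lines quoted above, it can be wrapped in a small function. The name `dep_path` is made up, and the sketch uses `grep -F` so that the library name is matched literally, which differs slightly from the line in the comment.

```shell
#!/bin/bash
# Hypothetical wrapper around the awk expression from the comment:
# reads ldd output on stdin and prints the path for library name $1.
dep_path() {
    grep -F -- "$1" | awk '{
        if ((NF == 4) && ($3 ~ /^\//)) print $3          # name => /path (address)
        else if ((NF == 2) && ($1 ~ /^\//)) print $1     # /path (address)
    }'
}

sample='linux-gate.so.1 => (0xffffe000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d9c000)
/lib/ld-linux.so.2 (0xb7f66000)'

printf '%s\n' "$sample" | dep_path libc.so.6           # /lib/tls/i686/cmov/libc.so.6
printf '%s\n' "$sample" | dep_path /lib/ld-linux.so.2  # /lib/ld-linux.so.2
printf '%s\n' "$sample" | dep_path linux-gate.so.1     # prints nothing
```

With this expression the dynamic loader line is no longer skipped, while entries that only carry an address are still ignored.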

kenzie jones said...

Thanks for the post. It really helped me a lot in my project. One more thing I want to say is that the feedback given by other people also helped me a lot.