Thursday, 22 October 2015

Visualize dependencies of binaries and libraries on Linux

Update4: Albert Astals Cid mentioned that KDE also maintains a version of this script: draw_lib_dependencies

Update3: Marco Nelissen fixed an issue that caused dependency resolution to break as soon as MAXDEPTH was reached once. I am quite happy that this old script is still useful and even gets improved. The updated version can also be found at dependencies.sh. The version below is fixed as well.

Update2: pfee made some more fixes. The script now parses the dependency tree correctly using readelf and ldd, so that only direct dependencies appear in the graph. The updated version can also be found at dependencies.sh


Update: Thanks to the feedback from pfee, I made some fixes to the script. The script is now also available for direct download: dependencies.sh


Sometimes it is useful to know the library dependencies of an application or a library on Linux (or Unix). Open-source applications in particular depend on lots of libraries, which in turn depend on other libraries. So it is not always clear which dependencies your software has.


Imagine you want to package up your software for a customer and need to know on which libraries your software depends. Usually you know which libraries were used during development, but what are the dependencies of these libraries? You have to package all dependencies so that the customer can use and/or install your software.


I created a bash script which uses ldd to find the dependencies of a binary on Linux and Graphviz to create a dependency graph from this information. Benedikt Hauptmann had the idea to show dependencies as a graph, so I cannot take credit for that. Using this script I created the dependency graph of TFORMer, the report generator we are developing at TEC-IT. The result is a nice graph showing all the dependencies a user has to have installed before using TFORMer.




Another beautiful graph is the one of PoDoFo, shown below.




The dependencies of Firefox are way more complex than the examples shown above...




If you want to create a graph of your favorite application or library yourself, get the script from here. I published the simple source code below. Graphviz is the only requirement. Usage is very simple: just pass an application or library as the first argument and the output image as the second. The script will always create a PNG image:
./dependencies.sh /usr/bin/emacs emacs.png
./dependencies.sh /usr/local/lib/libpodofo.so \
                  podofo.png
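
The intermediate file that the script hands to Graphviz is plain DOT text. As a rough sketch of that format only (the helper name edges_to_dot and the sample edges are made up for illustration, they are not part of dependencies.sh), a list of "from to" pairs can be turned into the same kind of graph description like this:

```shell
#!/bin/bash
# Hypothetical helper, not part of dependencies.sh:
# read "from to" pairs on stdin and print a DOT digraph.
# $1 is the name of the root node, which is drawn as a box.
edges_to_dot() {
    local root=$1 from to
    echo "digraph DependencyTree {"
    echo "  \"$root\" [shape=box];"
    while read -r from to; do
        echo "  \"$from\" -> \"$to\";"
    done
    echo "}"
}

# Made-up sample edges, just to show the shape of the output.
edges_to_dot ls <<'EOF'
ls libc.so.6
ls libselinux.so.1
libselinux.so.1 libc.so.6
EOF
```

Piping the result into `dot -Tpng -o ls.png` should render it, assuming Graphviz is installed.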



The code of the script is as follows: (Warning: the style sheet cuts off some lines, so it is better to download the script from dependencies.sh)


#!/bin/bash
 
# This is the maximum depth to which dependencies are resolved
MAXDEPTH=14
 
# analyze a given file for its direct
# dependencies using readelf (or dump) and ldd
# and write the results to a given temporary file
#
# Usage: analyze [OUTPUTFILE] [INPUTFILE]
function analyze
{
    local OUT=$1
    local IN=$2
    local NAME=$(basename $IN)
 
    for i in $LIST
    do
        if [ "$i" == "$NAME" ];
        then
            # This file was already parsed
            return
        fi
    done
    # Put the file in the list of all files
    LIST="$LIST $NAME"
 
    DEPTH=$((DEPTH + 1))
    if [ $DEPTH -ge $MAXDEPTH ];
        then
        echo "MAXDEPTH of $MAXDEPTH reached at file $IN."
        echo "Continuing with next file..."
        # Fix by Marco Nelissen for the case that MAXDEPTH was reached
        DEPTH=$((DEPTH - 1))
        return
    fi
 
    echo "Parsing file:              $IN"
 
    $READELF "$IN" &> "$READELFTMPFILE"
    ELFRET=$?
 
    if [ $ELFRET != 0 ];
        then
        echo "ERROR: ELF reader returned error code $ELFRET"
        echo "ERROR:"
        # Show what the ELF reader printed, not the dot file
        cat "$READELFTMPFILE"
        echo "Aborting..."
        rm $TMPFILE
        rm $READELFTMPFILE
        rm $LDDTMPFILE
        exit 1
    fi
 
    DEPENDENCIES=$(cat $READELFTMPFILE | grep NEEDED | awk '{if (substr($NF,1,1) == "[") print substr($NF, 2, length($NF) - 2); else print $NF}')
 
    for DEP in $DEPENDENCIES;
    do
        if [ -n "$DEP" ];
        then
 
            ldd "$IN" &> "$LDDTMPFILE"
            LDDRET=$?
 
            if [ $LDDRET != 0 ];
                then
                echo "ERROR: ldd returned error code $LDDRET"
                echo "ERROR:"
                # Show what ldd printed, not the dot file
                cat "$LDDTMPFILE"
                echo "Aborting..."
                rm $TMPFILE
                rm $READELFTMPFILE
                rm $LDDTMPFILE
                exit 1
            fi
 
            DEPPATH=$(grep $DEP $LDDTMPFILE | awk '{print $3}')
            if [ -n "$DEPPATH" ];
            then
                echo -e "  \"$NAME\" -> \"$DEP\";" >> $OUT
                analyze $OUT $DEPPATH
            fi
        fi
    done
 
    DEPTH=$((DEPTH - 1))
}

########################################
# main                                 #
########################################

if [ $# != 2 ];
    then
    echo "Usage:"
    echo "  $0 [filename] [outputimage]"
    echo ""
    echo "This tool analyses a shared library or an executable"
    echo "and generates a dependency graph as an image."
    echo ""
    echo "GraphViz must be installed for this tool to work."
    echo ""
    exit 1
fi

DEPTH=0
INPUT=$1
OUTPUT=$2
TMPFILE=$(mktemp -t)
LDDTMPFILE=$(mktemp -t)
READELFTMPFILE=$(mktemp -t)
LIST=""

if [ ! -e "$INPUT" ];
    then
    echo "ERROR: File not found: $INPUT"
    echo "Aborting..."
    exit 2
fi

# Use either readelf or dump
# Linux has readelf, Solaris has dump
READELF=$(type readelf 2> /dev/null)
if [ $? != 0 ]; then
  READELF=$(type dump 2> /dev/null)
  if [ $? != 0 ]; then
    echo Unable to find ELF reader
    exit 1
  fi
  READELF="dump -Lv"
else
  READELF="readelf -d"
fi
 
 
 
echo "Analyzing dependencies of: $INPUT"
echo "Creating output as:        $OUTPUT"
echo ""
 
echo "digraph DependencyTree {" > $TMPFILE
echo "  \"$(basename $INPUT)\" [shape=box];" >> $TMPFILE
analyze $TMPFILE "$INPUT"
echo "}" >> $TMPFILE

#cat $TMPFILE # output generated dotfile for debugging purposes
dot -Tpng "$TMPFILE" -o "$OUTPUT"
 
rm $LDDTMPFILE
rm $READELFTMPFILE
rm $TMPFILE

exit 0
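
The script removes its temporary files by hand on every exit path. A possible simplification, shown here only as a sketch and not as what the published script does, is to register a single cleanup function with bash's trap builtin so the files are removed whichever way the script ends. The variable names are the ones used above; the cleanup function is an addition for illustration.

```shell
#!/bin/bash
# Sketch only: create the temporary files up front ...
TMPFILE=$(mktemp) || exit 1
LDDTMPFILE=$(mktemp) || exit 1
READELFTMPFILE=$(mktemp) || exit 1

# ... and remove them on any exit path (normal end or "exit 1").
# "cleanup" is a hypothetical name, not part of dependencies.sh.
cleanup() {
    rm -f "$TMPFILE" "$LDDTMPFILE" "$READELFTMPFILE"
}
trap cleanup EXIT
```

With this in place, the individual rm calls before each exit would no longer be needed.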

23 comments:

Kleag said...

Note that if you want to view the graph in an easy to use application, you can use the KDE app kgraphviewer (from extragear). You can even integrate the viewer as a KPart in your app.

Anonymous said...

Nice idea, some bugs:

54: ldd returned error code $?"
Should be "error code $RET" since $? will have changed at this point. $RET correctly remembers ldd errors from line 51.

63: Don't call "analyze" on ldd results which don't refer to files.
Hence change the line to:
DEPENDENCIES=$(cat $LDDTMPFILE | awk -F " " '{ if (!match($3, /\(.*\)/)) print $3; }')

The first time I tried the script, I used an executable that had ldd output with these lines:

linux-gate.so.1 => (0xffffe000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d9c000)
/lib/ld-linux.so.2 (0xb7f66000)

The analyze function should only be passed valid filenames.

Dominik Seichter said...

Thanks for your comments!
I uploaded a fixed script. Just compiling kgraphviewer - looks very interesting! Thanks for the tip.

Ciao,
Dom

daf said...

Hi - cool script. However, every binary I ran it on errored, the ending always looked like this:

"libobparser" -> "libxml2";
"libxml2" -> "libdl";
"libxml2" -> "libz";
"libz" -> "you";

the "you" looks very problematic. Any idea where it's coming from?

Dominik Seichter said...

The reason seems to be ldd output like the following:
% ldd /lib/libgcc_s.so.1
ldd: warning: you do not have execution permission for `/lib/libgcc_s.so.1'
linux-vdso.so.1 => (0x00007fffdbffe000)
libc.so.6 => /lib/libc.so.6 (0x00002b28cedc6000)
/lib64/ld-linux-x86-64.so.2 (0x0000555555554000)


So the "you" comes from "you do not have execution permission". I am not sure where this warning comes from, as I can run ldd on every binary on my system. At least one other user has reported similar issues, but not with all executables. Does running as root help?

Dominik Seichter said...

I uploaded a new version of the script to http://krename.sf.net/data/scripts/dependencies.sh which should fix your problem. Thanks for reporting.

Dom

Anonymous said...

Some more improvements, this time to make it work on Solaris.

Solaris awk doesn't like a space between -F and the following delimiter. Also the "match" function I suggested yesterday is not available with Solaris awk, therefore here's another way to cope with ldd output of the form:

linux-gate.so.1 => (0xffffe000)

Instead of using a regular expression to detect a starting character "(" and an ending character ")", it's probably enough just to reject strings starting with "(".

The changes:
Line 10: remove space between -F and "."

Line 60: remove space between -F and " ".
Line 60: change "!match($3, /\(.*\)/)" to "substr($3, 1, 1) != "("".

The complete line 60 is now:
DEPENDENCIES=$(cat $LDDTMPFILE | awk -F" " '{ if (substr($3, 1, 1) != "(" && substr( $0, 1, 13 ) != "ldd: warning:") print $3; }')

With these changes the script works on Linux and Solaris.
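
The filter can be tried on canned ldd output without needing a binary that triggers the warning. The sketch below wraps the awk condition from the line above in a function; the name ldd_targets is made up for this example, and the sample lines are the ones quoted earlier in this thread.

```shell
#!/bin/bash
# Hypothetical wrapper around the awk filter from the comment above.
# Reads ldd output on stdin and prints the third field of every line
# that is neither an "ldd: warning:" line nor an address-only entry.
ldd_targets() {
    awk '{ if (substr($3, 1, 1) != "(" && substr($0, 1, 13) != "ldd: warning:") print $3 }'
}

ldd_targets <<'EOF'
ldd: warning: you do not have execution permission for `/lib/libgcc_s.so.1'
linux-vdso.so.1 => (0x00007fffdbffe000)
libc.so.6 => /lib/libc.so.6 (0x00002b28cedc6000)
/lib64/ld-linux-x86-64.so.2 (0x0000555555554000)
EOF
```

Note that the loader line has only two fields, so the filter prints an empty line for it. A for loop over the unquoted result drops that through word splitting, but it also means the loader itself never gets analyzed.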

Anonymous said...

Another thought - this script gives the wrong results. The output of ldd has been misinterpreted.

ldd reports both immediate dependencies and recursively follows these to find subsequent dependencies.

This script also recursively looks for sub-dependencies. Therefore the resulting graph has too many links between libraries.

Instead a tool which can give you only immediate dependencies is needed, such as readelf.

Example output from ldd:
$ ldd /bin/ls
linux-gate.so.1 => (0xffffe000)
librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7ed7000)
libacl.so.1 => /lib/libacl.so.1 (0xb7ed0000)
libselinux.so.1 => /lib/libselinux.so.1 (0xb7eb9000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d6f000)
libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7d57000)
/lib/ld-linux.so.2 (0xb7efc000)
libattr.so.1 => /lib/libattr.so.1 (0xb7d53000)
libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7d4f000)
libsepol.so.1 => /lib/libsepol.so.1 (0xb7d0e000)

However, readelf shows that /bin/ls has far fewer direct dependencies:

readelf -d /bin/ls | grep NEEDED
0x00000001 (NEEDED) Shared library: [librt.so.1]
0x00000001 (NEEDED) Shared library: [libacl.so.1]
0x00000001 (NEEDED) Shared library: [libselinux.so.1]
0x00000001 (NEEDED) Shared library: [libc.so.6]

Therefore the graphviz PNG should only show four links starting at /bin/ls instead of the eight we get with the current script.

We'll need to use readelf to get the direct dependencies. Then we'll need to use ldd to find the path at which this dependency can be found. Then proceed with recursive analysis as before.

Now all that's left is to implement the change.
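
A minimal sketch of those two steps, assuming Linux readelf and ldd output in the formats shown above. The function names needed_from_readelf and path_from_ldd are invented for this illustration, and this is not necessarily how the final script implements it.

```shell
#!/bin/bash
# Hypothetical helper: print the library names from the NEEDED entries
# of `readelf -d` output read on stdin, with the brackets stripped.
needed_from_readelf() {
    awk '/\(NEEDED\)/ { gsub(/^\[|\]$/, "", $NF); print $NF }'
}

# Hypothetical helper: look up the path of one library name ($1)
# in `ldd` output read on stdin. Entries without a path are ignored.
path_from_ldd() {
    awk -v name="$1" '$1 == name && $2 == "=>" && $3 ~ /^\// { print $3; exit }'
}
```

For a real binary this could be used as `readelf -d /bin/ls | needed_from_readelf`, followed by `ldd /bin/ls | path_from_ldd libc.so.6` for each name, and then a recursive call on each resolved path.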

daf said...

You've fixed my problem, cheers. I'll still watch the comments, this is a neat utility.

Anonymous said...

Further fixes available at:
http://rafb.net/p/N7a07F26.nln.html

Please copy from here before the page expires.

I've removed -F " " from awk statements since this is the default delimiter already.

The graph now shows direct dependencies. This can remove a lot of incorrect links making the graphs much clearer and more accurate.

I've used "readelf" on Linux and "dump" on Solaris.

Dominik Seichter said...

pfee, thanks for your work. I uploaded your fixed version of the script. How could I miss that ldd gives the wrong results? Anyway, your new version seems to be correct now! Thanks for helping with this.

best regards,
Dom

Anonymous said...

Hi Dominik,

Now that the script gives more concise graphs, it would be interesting to rerun your examples.

Currently the application's node has a link to every library. Since it should only have a link to its direct dependencies, rerunning the script would make the diagrams clearer.

Here are some more improvement ideas.

Firstly, try to avoid the temporary files. Perhaps the output of the child processes (ldd and readelf) could be captured in variables. This removes the need to clean up the files on exit and seems generally better to me.

Secondly, make the second parameter, the output PNG filename, optional. If no filename is supplied, look around the system for a suitable image display program and check if the DISPLAY environment variable is set.

If these conditions are met, then bring the image up on screen immediately. This will speed up the user's experience as it's most likely that they'll view the image immediately after creating it.

If a viewer isn't available, then output an error message guiding the user to supply a second parameter which will then cause PNG output to disk as normal.

With the ability to go straight from an ELF file (i.e. the application or shared library) to an on-screen image, it would then be nice to integrate this with KDE so that right-clicking on ELF files gives the option to throw up the dependency graph. I'm not too familiar with KDE filetype/application mappings but I don't think this step would be too difficult.

Thanks,
Paul
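
Both ideas can be sketched in a few lines of bash. Everything here is an assumption for illustration: the helper names capture and pick_viewer are made up, and the list of viewer programs (xdg-open, display, eog) is just a guess at what might be installed.

```shell
#!/bin/bash
# Idea 1: capture a child process's output in a variable instead of a
# temporary file. Combined stdout/stderr ends up in OUTPUT_TEXT and
# the exit code in STATUS.
capture() {
    OUTPUT_TEXT=$("$@" 2>&1)
    STATUS=$?
}

# Idea 2: find an image viewer, but only when a display is available.
# Prints the viewer's name and returns 0, or returns 1 if none fits.
pick_viewer() {
    local v
    [ -n "${DISPLAY:-}" ] || return 1
    for v in xdg-open display eog; do
        if command -v "$v" > /dev/null 2>&1; then
            echo "$v"
            return 0
        fi
    done
    return 1
}
```

A script could then call `capture ldd "$IN"` and inspect $STATUS and $OUTPUT_TEXT, and fall back to asking for a second parameter whenever pick_viewer fails.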

heobeo said...

i love Linux >"<

josiem said...

Thanks for taking the time to discuss this, I feel strongly about it and love learning more on this topic. If possible, as you gain expertise, would you mind updating your blog with more information? It is extremely helpful for me.

Unknown said...

Great tool! However, it looks like the program leaves out libraries that ldd lists by absolute path. Please let me know if I'm wrong in making this change, as I plan to further customize the script to copy programs into an initrd with all needed libs.

DEPPATH=$(grep $DEP $LDDTMPFILE | awk '{if ((NF==4) && ($3 ~ /^\//)) {print $3} else if ((NF==2) && ($1 ~ /^\//)) {print $1}}')

#from pfee's post
linux-gate.so.1 => (0xffffe000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d9c000)
/lib/ld-linux.so.2 (0xb7f66000) <--skipped
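
The effect of this change can be checked against the three sample lines above. In this sketch the awk conditions are the ones from the proposed DEPPATH line; the wrapper name dep_path and the use of grep -F are additions for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the file path for one dependency name ($1),
# reading ldd output on stdin. Handles both line shapes shown above.
dep_path() {
    grep -F -- "$1" | awk '
        NF == 4 && $3 ~ /^\// { print $3; next }   # name => /path (addr)
        NF == 2 && $1 ~ /^\// { print $1 }         # /path (addr)
    '
}

ldd_text='linux-gate.so.1 => (0xffffe000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d9c000)
/lib/ld-linux.so.2 (0xb7f66000)'

# The loader entry is now resolved instead of being skipped.
printf '%s\n' "$ldd_text" | dep_path ld-linux.so.2
```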

Unknown said...

Thanks for the post. It really helped me a lot in my project. One more thing I want to say is that the feedback given by other people also helped me a lot.

Farko777 said...

Where can I find the latest version of this script?

Unknown said...

thanks a lot!

Albert Astals Cid said...

We're maintaining it in KDE at https://quickgit.kde.org/?p=kde-dev-scripts.git&a=blob&h=c54fefcb02ee1036e7d1c5e8d935264af0280a95&hb=900952ec2103e1d6e417b41d87288eac598294ea&f=draw_lib_dependencies but it seems it has changed quite a lot, would you suggest just overwriting it with your new version? or?

SomJura said...

Thanks for the app, it seems great. However I constantly receive an error:
``ERROR: ldd returned error code 1
ERROR:
digraph DependecyTree {
p4v [shape=box];
Aborting...``

Do you know how to fix this?

Thank you very much!!

Dominik Seichter said...

Hi Albert,

Good to hear that KDE also uses this. I think you are also affected by that problem, but the fix is easy. At line 60 you have to decrement DEPTH by one.

DEPTH=$(($DEPTH + 1))
if [ $DEPTH -ge $MAXDEPTH ];
then
echo "MAXDEPTH of $MAXDEPTH reached at file $IN."
echo "Continuing with next file..."
+ DEPTH=$(($DEPTH - 1))
return
fi

I recommend you commit this fix.
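
For comparison, here is a different way to avoid this class of bug, shown on a toy tree rather than on real libraries. It is only a sketch and not what the script does: the depth is passed as a function argument, so there is no global counter to restore on an early return. The walk function and the node names are made up.

```shell
#!/bin/bash
MAXDEPTH=3

# Hypothetical toy recursion: every node "n" has the children "n.1"
# and "n.2". The depth travels as the second argument.
walk() {
    local node=$1 depth=$2
    if [ "$depth" -ge "$MAXDEPTH" ]; then
        return    # nothing to undo: depth is local to this call
    fi
    echo "$node"
    walk "$node.1" $((depth + 1))
    walk "$node.2" $((depth + 1))
}

walk root 0
```

With a global DEPTH that is incremented on entry, every return path has to decrement it again, which is exactly what the one-line fix above adds.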

Albert Astals Cid said...

Fix pushed! thanks :)