howto at price.mit.edu/blog

I wrote another post last week for the Ksplice blog: Strace — The Sysadmin’s Microscope. If you’re running a Linux system, or just writing or maintaining a complex program, sometimes strace is indispensable — it’s the tool that tells you what a program is really doing. My post explains why strace is so good at showing the interesting events in a program (hint: it gets to sit between the program and everything else in the universe), describes some of its key options, and shows a few ways you can use it to solve problems.

Unfortunately there’s only so much you can say in a blog post of reasonable length, so I had to cut some of my favorite uses down to bullet points. Here’s one such use, which I can’t bear to keep off of the Web, just because I thought I was so clever when I came up with it in real life a couple of months ago.

(If you haven’t already, I encourage you to go read the main post first. I’ll be here when you come back.)

Strace As A Progress Bar

Sometimes you start a command, and it turns out to take forever. It’s been three hours, and you don’t know if it’s going to be another three hours, or ten minutes, or a day.

This is what progress bars were invented for. But you didn’t know this command was going to need a progress bar when you started it.

Strace to the rescue. What work is your program doing? If it’s touching anything in the filesystem while it works, or anything on the network, then strace will tell you exactly what it’s up to. And in a lot of cases, you can deduce how far into its job it’s gotten.

For example, suppose our program is walking a big directory tree and doing something slow. Let’s simulate that with a synthetic directory tree and a find that just sleeps for each directory:

  $ mkdir tree && cd tree
  $ for i in $(seq 1000); do mktemp -d -p .; done >/dev/null
  $ find . -exec sleep 1 \;

Well, this is taking a while. Let’s open up another terminal and ask strace what’s going on:

  $ pgrep find
  2714
  $ strace -p 2714
  [...]
  open("tmp.HvbzfbbWSa", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
  fstat(5, {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
  fchdir(5)                               = 0
  getdents(5, /* 2 entries */, 4096)      = 48
  getdents(5, /* 0 entries */, 4096)      = 0
  close(5)                                = 0
  open("..", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 5
  fstat(5, {st_mode=S_IFDIR|0755, st_size=36864, ...}) = 0
  fchdir(5)                               = 0
  close(5)                                = 0
  newfstatat(AT_FDCWD, "tmp.MiHDWiBURu", {st_mode=02, st_size=17592186044416, ...}, AT_SYMLINK_NOFOLLOW) = 0
  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb19c92a770) = 13044
  wait4(13044, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 13044
  --- SIGCHLD (Child exited) @ 0 (0) ---
  open("tmp.MiHDWiBURu", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
  fstat(5, {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
  fchdir(5)                               = 0
  [...]

The find just looked at tmp.HvbzfbbWSa, and now it’s going into tmp.MiHDWiBURu. How far is that into the total? ls will tell us the list of directories that the find is working from; we just have to tell it to give them to us in the raw, unsorted order that the parent directory lists them in, with the -U flag. And then grep -n will tell us where in that list the entry tmp.HvbzfbbWSa appears:

  $ ls -U | grep -n tmp.HvbzfbbWSa
  258:tmp.HvbzfbbWSa
  $ ls -U | grep -n tmp.MiHDWiBURu
  259:tmp.MiHDWiBURu
  $ ls -U | wc -l
  1000

So tmp.HvbzfbbWSa is entry 258 out of 1000 entries in this directory — we’re 25.8% of the way there. If it’s been four minutes so far, then we should expect about twelve more minutes to go.

(But With The Benefit Of Foresight…)

I’d be remiss if I taught you this hackish approach without mentioning that if you realize you want a progress bar before you start the command, you can do it much better — after all, the ‘progress bar’ above doesn’t even have a bar, except in your head.

Check out pv, the pipe viewer. In my little example, you’d have the command itself print out where it is, like so:

  $ find . -exec sh -c 'echo $1 && sleep 1' -- \{\} \;
  .
  ./tmp.BToqLElOGC
  ./tmp.xnuhzmGbOP
  [...]

and then you could get a real, live, automatically-updated progress bar, like so:

  $ find . -exec sh -c 'echo $1 && sleep 1' -- \{\} \; \
     | pv --line-mode --size=$(find . | wc -l) >/dev/null
   175 0:02:57 [0.987/s ] [=====>                               ] 17% ETA 0:13:55

Here we’ve passed --line-mode to make pv count lines instead of its default of bytes, and --size with an argument to tell it how many lines to expect in total. Even if you can’t estimate the size, pv will cheerfully tell you how far you’ve gone, how long it’s been, and how fast it’s moving right now, which can still be handy. pv is a pretty versatile tool in its own right — explaining all the ways to use it could be another whole blog post. But the pv man page is a good start.

That’s Just One

There’s lots of other ways to use strace — starting with the two I described in my main post, and the three more, besides this one, that I only mentioned there. I don’t really know anymore how I used to manage without it.

Liked this post? Subscribe and keep ‘em coming.

Sometimes a MySQL query doesn’t die when you think it does.

If you’ve spent much time with MySQL, you’ve probably tried a query from the mysql command line, changed your mind after it didn’t return for a while, and hit C-c:

$ mysql -h youtomb-sql.mit.edu -u guest youtomb

mysql> SELECT COUNT(DISTINCT status) FROM artifacts;
^CCtrl-C -- sending "KILL QUERY 183" to server ...
Ctrl-C -- query aborted.
ERROR 1317 (70100): Query execution was interrupted

That kills the query. But notice what the mysql program is telling you after you hit that C-c: it’s sending a separate command, namely the “query” “KILL QUERY 183“, to the server.

In fact, that KILL query is the only way to get the MySQL server to stop running a query. In particular, the MySQL server is really bad at noticing when a client goes away. Suppose instead of hitting C-c, which the mysql program traps and handles in a smart way, I simply kill the program by hitting C-\:

mysql> SELECT COUNT(DISTINCT status) FROM artifacts;
^\Aborted

Then in fact the server keeps running the query. If I fire up the MySQL client anew and issue the query SHOW PROCESSLIST, I can see the query still chugging away:

$ mysql -h youtomb-sql.mit.edu -u guest youtomb

mysql> SHOW PROCESSLIST \G
*************************** 1. row ***************************
     Id: 183
   User: guest
   Host: OPUS.MIT.EDU:37938
     db: youtomb
Command: Query
   Time: 4
  State: Sending data
   Info: SELECT COUNT(DISTINCT status) FROM artifacts
*************************** 2. row ***************************
     Id: 185
   User: guest
   Host: OPUS.MIT.EDU:37940
     db: youtomb
Command: Query
   Time: 0
  State: NULL
   Info: SHOW PROCESSLIST
2 rows in set (0.00 sec)

That \G is an alternative to the semicolon that makes the format more readable here.

If I want to actually kill the query, I can do it with the same meta-query my client used automatically upon C-c:

mysql> KILL QUERY 183;
Query OK, 0 rows affected (0.02 sec)

Et voilà:

mysql> SHOW PROCESSLIST \G
*************************** 1. row ***************************
     Id: 185
   User: guest
   Host: OPUS.MIT.EDU:37940
     db: youtomb
Command: Query
   Time: 0
  State: NULL
   Info: SHOW PROCESSLIST
1 row in set (0.00 sec)

Now, why would you or I care? After all, nobody in their right mind goes about hitting control-backslash or employing equally messy means to kill their MySQL clients. And control-C behaves just as you’d hope — so long as you are using the mysql command-line client.

Where the story isn’t so good is on a typical other client program. The KILL behavior on control-C is a feature of the mysql program, not of the MySQL C API. (If you think about it, it involves installing a signal handler — not something a well-behaved library will just do.) And because it’s not a feature of the MySQL C API, it’s probably not a feature of your favorite language’s MySQL bindings, which wrap that API. In particular, I know it’s not a feature of MySQLdb, the leading Python bindings.

So suppose you write a Python script to do some MySQL queries… and you have a big honking table in your database, and you write an inefficient query… and the query planner resorts to copying most of the table to a temporary table… and after a couple of hours you kill the Python script with control-C or kill or some other means because it’s taking forever. The query will keep running. And the next day maybe it’s copied enough that it fills up your disk, and the database has an outage.

I wish that were a hypothetical. Fortunately, the MySQL server will then remove the temporary table and the disk will have space again. If you’re lucky, the server will even come back up.

Lesson: when you want to kill a MySQL query, make sure it dies. Use SHOW PROCESSLIST to check and KILL QUERY to kill.

price.mit.edu/blog

Archive for the ‘howto’ tag

Strace: The Lost Chapter

Strace As A Progress Bar

(But With The Benefit Of Foresight…)

That’s Just One

Preventing SQL Undeath (Killing a MySQL Query For Real)

Archives

Subscribe in a Reader

Subscribe via Email

Recent Comments

Twitter