Saturday, November 01, 2014

Editing ad-hoc config files on Linux

I had a need to edit the /etc/network/interfaces file on Ubuntu (from my program). At first I found some awk script on the web which claimed to do the job, but when I tried it, it didn't address all the CRUD cases I was interested in (also, I didn't want to have system() calls in my C code).
So, I searched for a better utility, and found Augeas - this is fantastic! You have to try it to believe it! It has a neat command-line utility (augtool) as well as an easy-to-use C API (+ bindings for many scripting languages, including my fav Lua too :-) ).

The following commands show how easy it is to add/remove an interface (using XPath-like path expressions):

$ sudo augtool # Add an interface at the end (last)
augtool> set /files/etc/network/interfaces/iface[last()+1] eth1
augtool> set /files/etc/network/interfaces/iface[last()]/family inet
augtool> set /files/etc/network/interfaces/iface[last()]/method static
augtool> set /files/etc/network/interfaces/iface[last()]/address 10.1.1.1
augtool> save
Saved 1 file(s)
augtool> 


$ sudo augtool  # Edit the added interface (by name, not position)
augtool> set /files/etc/network/interfaces/iface[. = 'eth1']/netmask 255.255.255.0
augtool> save
Saved 1 file(s)
augtool> set /files/etc/network/interfaces/iface[. = 'eth1']/mtu 1500
augtool> save
Saved 1 file(s)
augtool> 

$ sudo cat /etc/network/interfaces 
auto lo
iface lo inet loopback
iface eth1 inet static
   address 10.1.1.1
   netmask 255.255.255.0
   mtu 1500

$ sudo augtool # Let's just delete eth1 now
augtool> rm /files/etc/network/interfaces/iface[. = 'eth1']
rm : /files/etc/network/interfaces/iface[. = 'eth1'] 6
augtool> save
Saved 1 file(s)
augtool> 

Now, the same/similar exercise programmatically:
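Here's a minimal sketch using the Perl binding, Config::Augeas (one of the many scripting-language bindings; on Ubuntu it's the libconfig-augeas-perl package - I'm assuming from the CPAN docs that save() returns a true value on success, so double-check). It mirrors the augtool session above, and needs the same root privileges:

#!/usr/bin/perl
use strict;
use warnings;
use Config::Augeas;

my $aug = Config::Augeas->new( root => '/' );

# Append a new iface node and fill in its children,
# same paths as in the augtool session above
$aug->set( '/files/etc/network/interfaces/iface[last()+1]',       'eth1' );
$aug->set( '/files/etc/network/interfaces/iface[last()]/family',  'inet' );
$aug->set( '/files/etc/network/interfaces/iface[last()]/method',  'static' );
$aug->set( '/files/etc/network/interfaces/iface[last()]/address', '10.1.1.1' );

# Write /etc/network/interfaces back to disk
$aug->save or die "could not save the interfaces file\n";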

Monday, October 27, 2014

Elixir: Functional |> Concurrent |> Pragmatic |> Fun

Since I have been subscribed to the Pragmatic Bookshelf, one book which I have been waiting (for months) to read is Programming Elixir: Functional |> Concurrent |> Pragmatic |> Fun by Dave Thomas (remember The Pragmatic Programmer?).
There seems to be a lot of excitement around the language, with books being written even before its official release is out. Elixir is developed by José Valim, a core developer of Rails, so the cool features of Ruby are to be expected.

Erlang (concurrent, reliable, but hard!) is still popular, and has been used to write a lot of cool (and stable) software - some popular, like WhatsApp and OTP, and some well known only in the networking world, like ConfD. Here is a nice commentary on Elixir by Joe Armstrong, the creator of Erlang:
http://joearms.github.io/2013/05/31/a-week-with-elixir.html

Amazon says the book arrives on the 30th - I have signed up to be notified! Can't wait to get the book (and to get a hold on concurrent programming on the Erlang VM)! :)

Wednesday, August 27, 2014

Simple asynchronous task spawn from shell

Yesterday was a hard day at work! I was meddling with some shell (bash) script where I had to make a CLI command work - a command which would restart self (the whole set of services, including the ones that spawned the CLI).

That was somehow not working! Before the script could restart the services, it would get terminated! I did a lot of reading, and learnt more bash tricks in a day than I had in a long time! :)

Now, for this issue, what I needed was a way to tell someone "Please restart me", because "I'm not able to restart myself!" - ah, an asynchronous task/job!

I already had a daemon to execute tasks, but since it was also part of the processes that get killed on restart, I could not use it. I had a simple shell script, and I needed no additional complexity; what could I do? ... A little more digging on Stack Exchange led me to a simple solution.

Schedule a job to be executed at a later time (or immediately) using at!

So, my solution boiled down to a very simple change; instead of calling restart directly, I had to say

echo restart | at now

(at reads the job's commands from stdin, and the job runs under atd - which is not among the services being restarted - so it survives my script's death)

wow!

Sunday, April 27, 2014

Merge Excel sheets ?

No problem!

A friend of mine had a problem: she had to merge two [huge] Excel workbooks, by matching the names in one with the names in the other. Something like:

WorkBook1
+---------+---------+----------+----------+
| Name1   |  field1 |  field2  |  ...     |
+---------+---------+----------+----------+
|  ...    |         |          |          |
+---------+---------+----------+----------+
|  ...    |         |          |          |
+---------+---------+----------+----------+

WorkBook2
+---------+---------+----------+----------+
| field2_1| field2_2|  Name2   |  ...     |
+---------+---------+----------+----------+
|  ...    |         |          |          |
+---------+---------+----------+----------+
|  ...    |         |          |          |
+---------+---------+----------+----------+


If they were database tables, then we could have done something along the lines of:

SELECT * FROM WorkBook1,WorkBook2 WHERE Name1=Name2;

But these are Excel sheets, and worse yet, the names in WorkBook1 are in the format:
"FirstName LastName"
and names in WorkBook2 are in format:
"LASTNAME, FIRSTNAME"

(i.e., uppercase, and with a comma) Duh! And there will be many names with 3-4 words - imagine the permutations.

Excel experts would say this can be solved with some cool macros, or maybe VB scripts! But I am an old-school Unix text guy! I think in terms of text filters only!

To see the problem of name matching, take this [hypothetical] name:

Shankaran Ganesh Krishnan

the permutations will be:

Shankaran Krishnan Ganesh
Krishnan Shankaran Ganesh
Krishnan Ganesh Shankaran
Shankaran Ganesh Krishnan
Ganesh Shankaran Krishnan
Ganesh Krishnan Shankaran

Some names can also contain the initials [with a period], like:

Ganesh K. Shankaran

So, how can we do the name matching? ... For a moment I thought of using a permuter and then saving all the permutations (stupid!), but that's not required!

Let's say we do the following:
 - Remove dots and commas
 - Change to lowercase (trim spaces too)
 - Sort the words of the name

If you had "Shankaran, Ganesh Krishnan" in WorkBook1, and "GANESH, SHANKARAN KRISHNAN" in WorkBook2, then both will become "ganesh krishnan shankaran" - a canonical form that we can match on.

Now, the only problem that remains is to save the .xls as .csv, so that I can load it into Perl (Parse::CSV). Unfortunately, Excel doesn't have an option to save all the sheets in a workbook
to CSVs at once, so I had to do that manually for each sheet and then merge. Other than that, it's pretty straightforward.

If you are about to say: Show me teh codez!
here you go ...
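The gist of it, as a sketch (the file names and column positions are made up for illustration; the real workbooks had many more fields):

#!/usr/bin/perl
use strict;
use warnings;
use Parse::CSV;

# Canonicalize a name: drop dots and commas, lowercase, sort the words.
# "Shankaran, Ganesh Krishnan" and "GANESH, SHANKARAN KRISHNAN"
# both become "ganesh krishnan shankaran".
sub canon {
    my ($name) = @_;
    $name =~ s/[.,]//g;
    my @words = grep { length } split /\s+/, lc $name;
    return join ' ', sort @words;
}

# Index WorkBook2's rows by canonical name (name in column 3)
my %wb2;
my $csv2 = Parse::CSV->new( file => 'workbook2.csv' );
while ( my $row = $csv2->fetch ) {
    $wb2{ canon( $row->[2] ) } = $row;
}

# Walk WorkBook1 (name in column 1), emit the joined rows
# (naive output - no re-quoting of fields)
my $csv1 = Parse::CSV->new( file => 'workbook1.csv' );
while ( my $row = $csv1->fetch ) {
    my $match = $wb2{ canon( $row->[0] ) } or next;
    print join( ',', @$row, @$match ), "\n";
}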
What good are coding skills, if you cannot put them to use, at the right time, to help friends!? :-)


Wednesday, April 16, 2014

Recognize, and transform text


Many a time, I see some text which is not in any known format, but which looks vaguely familiar, or simple enough to transform. The reason I would want to transform it is, of course, to work with it - to load it into my scripting environment, to analyze it, consume it, or apply some complex programmatic logic to it.

Here, I give some examples and show the conversions. This could help in recognizing
raw text, and transforming it into its closest [known] cousin.

Case 1

Let's start with something simple. If you see a file something like this:

Alice:
sal=20000
age=23 
role=engineer

Bob:
sal=21000
age=28           
role=engineer

and you want to load this into your preferred programming environment (like a
Python dict, Lua table, or a Perl hash) to work with it. As it stands, it is not in a format that is directly usable! But if we make a small change to the data - say, change each line ending with a colon (like Alice:) to a bracketed one (like [Alice]) - we get:

[Alice]
sal=20000
age=23 
role=engineer
  
[Bob]   
sal=21000
age=28           
role=engineer

Now, this is a valid .ini file (a format popular in the Windows world). And there are libraries
for most languages to load and work with INI files!
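In Perl, for instance, loading the converted file takes a few lines with Config::Tiny (from CPAN; emp.ini being our converted file):

use Config::Tiny;

my $cfg = Config::Tiny->read('emp.ini')
    or die Config::Tiny->errstr;
print $cfg->{Alice}{sal}, "\n";   # 20000
print $cfg->{Bob}{role}, "\n";    # engineer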

What you need is a little Perl or sed regex to convert from the former to the latter! And
don't think of Jamie Zawinski's famous quote and be afraid (for such simple cases, regex is a
good fit - but make sure you really understand regexes, to wield one when needed).
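For this case, a one-liner along these lines does it (a sketch; it assumes the group names are single words, and the file names are placeholders):

$ perl -pe 's/^(\w+):\s*$/[$1]/' emp.txt > emp.ini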

Case 2

If you have seen some router configs (like JUNOS configs), or some BibTeX entries, then the following
will be familiar:

interface {
    eth0 {
      ip4  10.1.1.2;
      bia  aa:11:22:11:00:11;
    }
}

Again, this may not be directly loadable into your environment, but look at it again - doesn't it look
close to JSON? All you need to do is ensure that the keys and values are quoted correctly.

As JSON:

{
 "interface" : {
    "eth0" : {
      "ip4" : "10.1.1.2",
      "bia" : "aa:11:22:11:00:11"
    }
 }
}

Or, Lua table:

interface = {
    eth0 = {
      ip4 = '10.1.1.2',
      bia = 'aa:11:22:11:00:11'
    }
}

Again, both of these can be achieved with minimal changes.
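And once it is JSON, loading it is one call away; in Perl, for example (JSON::PP ships with modern Perls):

use JSON::PP qw(decode_json);

my $json = do { local $/; <DATA> };          # slurp the JSON text
my $data = decode_json($json);
print $data->{interface}{eth0}{ip4}, "\n";   # 10.1.1.2

__DATA__
{ "interface" : { "eth0" : {
      "ip4" : "10.1.1.2", "bia" : "aa:11:22:11:00:11" } } }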

Case 3

This might look very similar to Case 1, but observe the nesting, and the richer data set!

[Alice]
sal=20000
age=23
role=[current=engineer;previous=DevOps,TAC]

[Bob]  
sal=21000
age=28          
role=[current=engineer;previous=]

Now, converting this to .ini doesn't seem to fit! Can we convert it to something else? Say I do this:

Alice:
  sal: 20000
  age: 23
  role:
        current: engineer
        previous:
                 - DevOps
                 - TAC

Bob:
  sal: 21000
  age: 28
  role:
        current: engineer
        previous:

Aha, now this is valid YAML! YAML, like JSON, is also fat-free XML! And you have libraries
in all languages to load and work with YAML.
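As before, loading it is one call; in Perl (YAML::XS from CPAN; emp.yml being our converted file):

use YAML::XS qw(LoadFile);

my $emp = LoadFile('emp.yml');
print $emp->{Alice}{role}{current}, "\n";                    # engineer
print join( ',', @{ $emp->{Alice}{role}{previous} } ), "\n"; # DevOps,TAC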

Case 4

We all know CSV; if you have seen simple spreadsheet data (think MS Excel), that's valid CSV. Also, spreadsheet editors give you an option to save the data as plain CSV.

But what if the data were like this:

 a:b:c:"d"
or
 a|b|"c"|d

isn't it simple to change the delimiter to a comma (',') so that you can work with CSV libraries?
Bonus - if you have to send the data to a suit, just attach it, and they can open it in a spreadsheet editor! You know, suits frown on plain-text attachments! :-/

Note: the regex should be careful enough to handle quoting! (that applies to all the cases listed above)
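Better still, for this case you can let a CSV library handle both the quoting and the delimiter swap. A sketch with Text::CSV (the file names are made up):

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

# Read colon-separated, write comma-separated; the module takes
# care of quoted fields, so a ':' inside quotes survives the swap.
my $in  = Text::CSV->new( { sep_char => ':', binary => 1 } );
my $out = Text::CSV->new( { binary => 1, eol => "\n" } );

open my $rfh, '<', 'data.colons' or die $!;
open my $wfh, '>', 'data.csv'    or die $!;
while ( my $row = $in->getline($rfh) ) {
    $out->print( $wfh, $row );
}
close $wfh;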

To summarize, you don't need a complicated parser to load text into your favorite language, to analyze it, or to apply programmatic transformations to it. All you need is to recognize the format, and find the closest known format you can convert it to, so that you can conveniently work with it. The following table might make it easier to remember:

       
Text                                  Easily converted to
----                                  -------------------
Delimited (line oriented)             CSV
Grouped, and simple key-value         INI
Indented, multi-level, with lists     YAML
Brace-nested, and key-value           JSON/Py-dict/Lua-table

Tuesday, April 08, 2014

FIGlets ?

I had used the Unix banner command many times, but I had never bothered to check how the other cool-looking typefaces were generated. Most often, on starting up some open-source server/daemon, you'd come across a banner like:

                 ____                                   
 _ __ ___  _   _|  _ \  __ _  ___ _ __ ___   ___  _ __  
| '_ ` _ \| | | | | | |/ _` |/ _ \ '_ ` _ \ / _ \| '_ \ 
| | | | | | |_| | |_| | (_| |  __/ | | | | | (_) | | | |
|_| |_| |_|\__, |____/ \__,_|\___|_| |_| |_|\___/|_| |_|
           |___/                                        

Though I was sure these were not manually typed into an editor, I never probed much. For some reason, today I wanted to put one such banner on my daemon's start-up. After some Google digging, I found the source - FIGlet fonts.

But there's no need to install the figlet utility; instead, try this web app - TAAG (Text to ASCII Art Generator). And, if you are working on an application - add a FIGlet banner ;-)

--EDIT-- (30-apr)

After playing around and having fun with FIGlets, I learnt about TOIlet :) (now, wait, hold your imagination) - it's FIGlet + filters, and how colorful!

Look at the project page; it's much more than just colorful banners!

Monday, February 17, 2014

Colorizing GDB backtrace

Being in a startup, we get to see a lot of core dumps, every day ;-). With too many stack frames and long file+function names, I hate the output! In fact, the visual clutter is so much that it takes me a long time to compare two tracebacks! I was thinking about what could be done, and hit upon gdb hooks in a Stack Overflow post. Put together with my all-time fav - Perl - to color the output, I got some nice colorful tracebacks! ;-). The following is what I put in my ~/.gdbinit:
shell mkfifo /tmp/colorPipe.gdb
# no pause/automore 
set height 0
# dont break line
set width 0
define hook-backtrace
        shell cat /tmp/colorPipe.gdb |  ~/highlgdb.pl &
        set logging redirect on
        set logging file /tmp/colorPipe.gdb
        set logging on
end
define hookpost-backtrace
        set logging off
        set logging redirect off
        shell sleep 0.1
end
define hook-quit
        shell rm /tmp/colorPipe.gdb
end
define re
   echo \033[0m
end

And the Perl script (highlgdb.pl) to do the highlight:
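Something along these lines (a sketch - pick your own colors; it assumes an ANSI-capable terminal and the usual gdb frame format):

#!/usr/bin/perl
use strict;
use warnings;
$| = 1;    # we read from a fifo, so don't buffer the output

# A typical frame looks like:
#   #3  0x00007f.. in my_func (arg=1) at src/file.c:42
while (<>) {
    s/^(#\d+)/\e[1;31m$1\e[0m/;              # frame number - red
    s/\bin (\S+) \(/in \e[1;32m$1\e[0m (/;   # function name - green
    s/\bat (\S+:\d+)/at \e[1;33m$1\e[0m/;    # file:line - yellow
    print;
}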

Try it! At least identifying the culprit will be much quicker, if not fixing the issue! ;-)

Tuesday, February 04, 2014

Using API docs as a test aid

I wrote about my custom documentation using Lua embedded in my C source. With the complete description of a function - its arguments [and their types], return type etc. - can we do more with it?
I wanted some test stubs for the Lua APIs (that I am defining from my C back-end), and all I had to do was re-define the mydoc function to consume the same data, but instead of generating documentation, generate some Lua code to test the APIs.
I.e., if I have a function that takes a number and returns a string (on success) or nil (on failure), I generate:
function Stub.test (n)
  assert(type(n) == 'number', 'n should be a number')
  if g_nil_path then return nil end
  return 'rand-str'
end 

With this, I can see and get a feel of the Lua APIs that my C functions are exposing; I can also invoke the user scripts just to check the argument types for correctness; and I can exercise the success path as well as the failure path by setting the global g_nil_path. And of course, for more randomness, I use math.random when I return a number, and string.rep(string.char(math.random(65,90)), math.random(1,10)) when returning strings.

Saturday, January 18, 2014

Using bash for a simple 'Domain Specific Micro Language'

I had bash scripts to manage some rules (like ACLs), and I had them all in a single script, which kept growing with the addition of new rules. At one point a senior colleague of mine suggested splitting them out into separate, manageable files. I thought it was time to move the script to Perl, but then I thought, let me give bash one more shot! And I had some fun [after a long time] with bash.

I came up with this simple approach to model the rules: a file with rule and end keywords, and all the required data as key-value pairs within, like:

rule
   object=obj1
   access=any
   action=allow
end
rule
   object=obj2
   access=write,execute
   action=deny
end

And then, a processor script to convert files like these to actual rules (suitable for consumption in our system). The skeleton of the processor script:
do_process() {
 for k in "${!vartbl[@]}"; do
  # use $k and ${vartbl[$k]}
 done
}
declare -A vartbl

split_kp_re='([a-zA-Z0-9_-]+)=(.*)'
rule_id=0
while read -r i; do
    if [[ $i =~ ^[[:space:]]*# ]] ; then continue; fi # skip comment lines
    case $i in
        'rule')
           rule_id=$(($rule_id + 1))
         ;;
        'end')
           do_process  
           vartbl=()
         ;;
        *)
           [[ $i =~ $split_kp_re ]]
           k=${BASH_REMATCH[1]}
           v=${BASH_REMATCH[2]}
           vartbl["$k"]=$v
         ;;
    esac
done < "$infile"

Here $infile is the input file (it can be read from the command line).
As can be seen, the function do_process, which processes one rule at a time, can use the values from the associative array vartbl and write them out in any required format - and our files can have comments too! (lines beginning with a #) :-)
Note: bash 4.0 or later is required for associative-array support.

Wednesday, October 30, 2013

The FizzBuzz of scripting

I have read a lot of articles on codinghorror, joelonsoftware and the like about the FizzBuzz test, but somehow, when I have to actually take a F2F interview and the candidate has 6-8+ years of experience, I hesitate to actually start with FizzBuzz. Instead, I start with a casual conversation about their past work, and see if I can probe more on any of the keywords from their resume. Recently, I had to interview an engineer for a QA role, and all I was asked to check was: How good is he in Perl? Now, I don't want to get into a debate on Perl, its TIMTOWTDI philosophy etc. Somehow, after the first few minutes of talking, this idea struck me: ask him a variant of FizzBuzz - I will term it The FizzBuzz of scripting, i.e,

Split an input [text] file on the following criteria:
if a line matches foo, write it out to /tmp/foo.out
if a line matches bar, write it out to /tmp/bar.out
if a line matches both foo and bar, write it out to /tmp/foobar.out

The candidate started out like: "I can use multiple greps, and write a simple shell script..". Then I added a few more constraints: it has to be fast, the input is large, and I want a more
robust and maintainable script.
The initial Perl script he came up with, using regexes, was fine. Then I wanted a little more tweaking, so I started asking about the points in the code where optimization was possible; at some point he gave up, saying no optimization was possible.

All I was looking for was the simplest change - that you can compile the regex once, and then match many times over!

/.*mypattern$/ and print; # Does a compile and match

This will not be obvious to many programmers who haven't tried different flavors of regexes in different languages: applying a regex is a two-stage process:
  1. Compile
  2. Match
$re = qr/.*mypattern$/; # Compile..
/$re/ and print; # Match

Usually the compile is compute-intensive, and if the pattern isn't going to change, then it's best to compile once and store the result. Some languages clearly separate the two steps, like Python:

re.compile ()
..
re.match()

Of course, this is how we would do it in C using POSIX regex as well:
regcomp()
..
regexec()

The point I am trying to make is that certain things are not obvious. Being polyglot'y helps; if not, a lot of reading does. Maybe I am belabouring the obvious, but sometimes the obvious isn't too obvious! :)
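For the record, a sketch of the kind of solution I was fishing for (under one reading of the exercise: lines matching both patterns go only to foobar.out):

#!/usr/bin/perl
use strict;
use warnings;

# Compile the two patterns exactly once, up front
my $foo_re = qr/foo/;
my $bar_re = qr/bar/;

open my $foo,    '>', '/tmp/foo.out'    or die $!;
open my $bar,    '>', '/tmp/bar.out'    or die $!;
open my $foobar, '>', '/tmp/foobar.out' or die $!;

while ( my $line = <> ) {
    my $f = $line =~ $foo_re;
    my $b = $line =~ $bar_re;
    if    ( $f and $b ) { print $foobar $line }
    elsif ($f)          { print $foo $line }
    elsif ($b)          { print $bar $line }
}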

Friday, June 21, 2013

A simple documentation generator

No, I don't want to create yet another documentation generator like Doxygen; there are lots of them, each with its specific purpose and strengths. But mostly (at least AFAIK), all of them are intended to be embedded in the host language, and to document the APIs of the same language. I.e., if one is writing some C code, and wants to document the APIs of that same C code - all is well! But I wanted something different (ah! as usual!).
I am embedding Lua into our application, and I am writing libraries, in C, to be used by Lua. And I want to document those APIs [of Lua] which I am building, by adding the usual doxygen-like comments in the source files.
I have 2 choices at this point: either I define some comment format/tags (like @something or \something-else), extract them out of the source, and then parse them and generate the docs; or ... use something which is well suited for data description. Aha! I'm anyway learning/using Lua, which has its roots in data description - why not embed Lua code into my C comments?
Of course!
All I need, at the minimum, is a brief description of what the function does, its arguments, and its return types. I can capture that, in Lua, like:
   mydoc {
       name = "str_len", -- function name
       args =  { string = 's' }, -- input arg and its type
       ret = { len = 'i' }, -- return val and its type
       brief = "return the length of input string",
   }

And a way to quickly fetch it from the source - make the comment lines begin with '*!' (or something similar):
/*
 *!   mydoc {
 *!       name = "str_len", -- function name
 *!       args =  { string = 's' }, -- input arg and its type
 *!       ret = { len = 'i' }, -- return val and its type
 *!       brief = "return the length of input string",
 *!   }
 */
And how to convert this to docs? Simple! Just grep these comments out of the C source, and then define the mydoc function in Lua:
  function mydoc (t) -- excluding err handling and formatting
    if t.name then
        print (libname .. ":" .. t.name , "\n") -- libname also from source
    end
    if t.brief then
        print (" -- ", t.brief, "\n")
    end
    if t.args then
        print ("\tTakes " .. tablelength(t.args)  .. " arguments")
        describeargs(t.args) -- writes the values and their types
    else
        print "\tTakes no arguments\n"
    end
    if t.ret then
        print ("\tReturns " .. tablelength(t.ret)  .. " values")
        describeargs(t.ret)
    end
  end

Run all of this (the mydoc function, and the extracted blocks) together in Lua, and there you go! Now, generating a text doc was so easy - how difficult is it to generate Markdown? I added both as options (to the bash wrapper, which extracts the comments and runs Lua on them) :-)

Tuesday, January 08, 2013

Presentations on a text editor!?

My teammate was making fun of me (for favoring all things text!) and joked that I should be making presentations/slides in text as well, and presenting with vim! :)
While this was a joke, I pondered the idea; someone must have already tried this, given the plethora of scripts available for vim. And indeed I found one which works well - presen.vim.
The text that you write is Markdown, i.e, you can compose your usual structured text, with sections, sub-sections etc, and presen.vim will convert it into a presentation.
Combined with the DrawIt vim plugin (to draw neat diagrams and rectangles), I could get started making presentations in vim! It works, and it's cool :)
While this might save you from PowerPoint Hell, you might land in PlainText Hell! :)

And BTW, Markdown in itself is cool; I've been composing all my text (README files etc) in Markdown these days. You can download the package from the author's website, and there are many Markdown editors available for Windows.

--Edit: 9-May-2013--
I found TPP much better than the vim presentation! It's in Ruby, and has some funky PowerPoint features - like text scrolling in from left/right/top etc :). Try it!

Sunday, December 16, 2012

Edit/open files quick


As developers, we edit a lot of source files, each with a specific extension,
and mostly we either type the names or let the shell do the auto-completion.
Most of the source files that I work on follow the long_underscore_separated_names convention, and I hate to even press TAB TAB TAB...
When I know a file name, or am seeing it on the console (from the previous ls), I want to be able to open it (in the current directory) with as few keystrokes as possible.

I came up with a small Perl script which, when combined with shell aliases,
can be a real time saver (and save you from RSI as well! :) ).
The same effect can be achieved with sed/awk or any other language with good
regex support, but Perl is my favorite.

Approach

Given a file like my_lisp_parser.c, I want to be able to call it up as
m l p, or just mlp (the first letters of the words, assuming file names are
separated with underscores or hyphens).
The extensions - let's not worry about those for now...
So, how do we go about it?
The regex way...
Build a regex from these characters, and then do a quick look-up in the current dir.
How to build this regex:
simple - split the input into its characters if it's one word, or use the words
as-is if there are multiple; then join them with [^_-]*[-_], i.e., "anything up
to the next word separator". For mlp, that gives: m[^_-]*[-_]l[^_-]*[-_]p

What to do in case of ambiguity? I.e., if we have a file like
my_lex_print.c (some hypothetical name, duh!),
then using mlp will be a problem; so we should say
m le p
where le is the disambiguator.

So, now we can edit some files with some extensions; what about those with
no extension? Usually I edit a lot of Makefiles - those I can handle as a
special case.

The extension is given as an argument to the script, but we don't invoke the
script directly - we go through shell aliases, so typing the extension will
not be necessary. I use bash BTW, so here are my bash aliases (for the most commonly used file extensions):

alias vc='~/short_open.pl c'
alias vh='~/short_open.pl h'
alias vp='~/short_open.pl p[ly]'
alias vy='~/short_open.pl yml'
alias vd='~/short_open.pl diff'
alias vs='~/short_open.pl sh'
alias vl='~/short_open.pl l[ou][ga]'
alias vm='~/short_open.pl mk'
 

I can open a .c file like vc mlp, or a .h file like vh mlp.

The full script is here:

#!/usr/bin/perl
# the editor comes from the environment, so the aliases only pass the extension
$editor = $ENV{EDITOR} || 'vim';
$ext    = shift;

if ( scalar(@ARGV) == 1 ) {
    $in = shift;
    if ((length($in) == 1) and ($in eq 'm')) {
        system("$editor [Mm]akefile");
        exit 0;
    }
    if ((length($in) == 1) and ($in eq 'R')) {
        system("$editor README");
        exit 0;
    }
    if ((length($in) == 1) and ($in eq 'w')) {
        system("$editor wscript");
        exit 0;
    }
    @p = split( //, $in ); # 1 arg - pick and split each char
}
else {
    @p = (@ARGV); # multi arg - use them as is
}
$res = '^';
$res .= join( '[^_-]*[-_]', @p );
$res .= '[^_-]*[.]';
$res .= $ext . '$';
$re = qr/$res/;
foreach $a ( glob("*.$ext") ) {
    if ( $a =~ m/$re/ ) {
        print " [$a] \n";
        system ("$editor $a");
        exit 0;
    }
}

# If it were not really a long name with underscore, then try
# a plain short name (assume 1 arg)
if ( scalar(@ARGV) > 1 ) {
    exit 1;
}
$res = $in;
$res .= '.*[.]'; # deliberately unbounded match, so that any part
$res .= $ext;    # of the file name can be the input
$re = qr/$res/;
foreach $a ( glob("*.$ext") ) {
    if ( $a =~ m/$re/ ) {
        print " [$a] \n";
        system ("$editor $a");
        exit 0;
    }
}