
Monday, November 30, 2015

Common config that can be utilized by multiple languages

Often, I want to separate some config/initial data out of a program and keep it in a way that is accessible to different parts of the system. If all parts of the system are in the same language, the config can be very simple - just function invocations with values:

For example, in shell:

   # Format:
   # Employee name role department
   Employee "Bheem"  "Fighter"  "Security"
   Employee "Chutki" "Hiring"   "HR"

Then, the invoker would just have to define the function Employee and source this file!

So far, so good, if all parts are in shell. But what if I want to use this config in some C code, in Lua scripts, etc.?

(Don't think file-read/parse/split/tokenize .. No no! yuck!)

Can we (somehow) define a common format that is usable by different environments (languages), with minimal or no effort? Observe the calling conventions:

C function call looks like:

   Employee (arg1, arg2 /* ... */)

Shell (is the simplest):

     Employee "arg1" "arg2" # ...

Lua:

   Employee (arg1, arg2) -- ...

But there is a special notation in Lua: when the single argument is a string literal or a table constructor, we can get rid of the parentheses, like:

   Employee "arg1"

OK, but what if we have more than one argument? Now we need to dig a bit deeper, into function chaining:

    function Employee (name)              -- first call captures the name
        return function (role)            -- second call captures the role
            return function (department)  -- third call gets the department
               add_to_db(name, role, department)
            end
        end
    end

Without going into much detail about anonymous functions, lexical scoping, upvalues, etc.: the function returns a function, which in turn returns yet another function. That means, when you invoke

    Employee "Bheem"  "Fighter"  "Security"

The first function reads the first value and returns its inner function; that inner function reads the second value and returns yet another function, which finally does the work (and can access all three values - exactly what we want)! Cool, ain't it? :-)

Wait, we solved the problem for shell and Lua; what about C code?

In the case of C code, we can always use the well-documented Lua C API to read the Lua config! :)
(after all, Lua evolved from a data-description language)
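For instance, here is a minimal sketch (the config file name employees.cfg and the printing in add_to_db are only illustrative; a Lua 5.x installation is assumed, compiled with something like -llua): register add_to_db as a C function, define the chained Employee function, and then simply run the shared config file:

#include <stdio.h>
#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>

/* C callback that receives all three values collected by the chained calls */
static int add_to_db(lua_State *L)
{
    const char *name       = luaL_checkstring(L, 1);
    const char *role       = luaL_checkstring(L, 2);
    const char *department = luaL_checkstring(L, 3);
    printf("adding: %s / %s / %s\n", name, role, department);
    return 0;
}

/* The same chained Employee function as above, fed to the Lua state */
static const char *employee_def =
    "function Employee (name)\n"
    "  return function (role)\n"
    "    return function (department)\n"
    "      add_to_db(name, role, department)\n"
    "    end\n"
    "  end\n"
    "end\n";

int main(void)
{
    lua_State *L = luaL_newstate();
    luaL_openlibs(L);
    lua_register(L, "add_to_db", add_to_db);
    if (luaL_dostring(L, employee_def) != 0 ||
        luaL_dofile(L, "employees.cfg") != 0)
        fprintf(stderr, "error: %s\n", lua_tostring(L, -1));
    lua_close(L);
    return 0;
}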

If you think all this is too much work - think again; remember the DRY principle.

Saturday, November 01, 2014

Editing ad-hoc config files on Linux

I had a need to edit the /etc/network/interfaces file on Ubuntu (from my program). At first I found some awk script on the web which claimed to do the job, but when I tried it, it didn't address all the CRUD cases I was interested in (also, I didn't want system() calls in my C code).
So I searched for a better utility and found Augeas - this is fantastic! You have to try it to believe it! It has a neat command-line utility (augtool) as well as an easy-to-use C API (plus bindings for many scripting languages, including my favourite, Lua :-) ).

The following commands show how easy it is to add/remove an interface (the paths are XPath expressions):

$ sudo augtool # Add an interface at the end (last)
augtool> set /files/etc/network/interfaces/iface[last()+1] eth1
augtool> set /files/etc/network/interfaces/iface[last()]/family inet
augtool> set /files/etc/network/interfaces/iface[last()]/method static
augtool> set /files/etc/network/interfaces/iface[last()]/address 10.1.1.1
augtool> save
Saved 1 file(s)
augtool> 


$ sudo augtool  # Edit the added interface (by name, not position)
augtool> set /files/etc/network/interfaces/iface[. = 'eth1']/netmask 255.255.255.0
augtool> save
Saved 1 file(s)
augtool> set /files/etc/network/interfaces/iface[. = 'eth1']/mtu 1500
augtool> save
Saved 1 file(s)
augtool> 

$ sudo cat /etc/network/interfaces 
auto lo
iface lo inet loopback
iface eth1 inet static
   address 10.1.1.1
   netmask 255.255.255.0
   mtu 1500

$ sudo augtool # Lets just delete eth1 now
augtool> rm /files/etc/network/interfaces/iface[. = 'eth1']
rm : /files/etc/network/interfaces/iface[. = 'eth1'] 6    <-- 6 fields removed
augtool> save
Saved 1 file(s)
augtool> 

Now, the same/similar exercise programmatically:
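Below is a minimal sketch of the same add/edit/remove steps using the Augeas C API (augeas.h; link with -laugeas). The paths match the augtool sessions above, and error handling is kept to a bare minimum:

#include <stdio.h>
#include <augeas.h>

int main(void)
{
    /* NULL root and load path mean: use the real / and the default lenses */
    augeas *aug = aug_init(NULL, NULL, AUG_NONE);
    if (aug == NULL)
        return 1;

    /* Add an interface at the end, as in the first augtool session */
    aug_set(aug, "/files/etc/network/interfaces/iface[last()+1]", "eth1");
    aug_set(aug, "/files/etc/network/interfaces/iface[last()]/family", "inet");
    aug_set(aug, "/files/etc/network/interfaces/iface[last()]/method", "static");
    aug_set(aug, "/files/etc/network/interfaces/iface[last()]/address", "10.1.1.1");

    /* Edit it by name, as in the second session */
    aug_set(aug, "/files/etc/network/interfaces/iface[. = 'eth1']/netmask",
            "255.255.255.0");

    /* Or remove it altogether, as in the last session */
    /* aug_rm(aug, "/files/etc/network/interfaces/iface[. = 'eth1']"); */

    if (aug_save(aug) != 0)
        fprintf(stderr, "save failed; see the /augeas//error nodes\n");

    aug_close(aug);
    return 0;
}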

Tuesday, February 04, 2014

Using API docs as a test aid

I wrote about my custom documentation using Lua embedded in my C source. With the complete description of a function - its arguments [and their types], return type, etc. - can we do more with it?
I wanted some test stubs for the Lua APIs (that I am defining from my C back-end), and all I had to do was redefine the mydoc function to consume the same data, but instead of generating documentation, generate some Lua code to test the APIs.
I.e., if I have a function that takes a number and returns a string (on success) or nil (on failure), I generate:
function Stub.test (n)
  assert(type(n) == 'number', 'n should be a number')
  if g_nil_path then return nil end
  return 'rand-str'
end

With this, I can see and get a feel for the Lua APIs that my C functions are exposing. I can also run the user scripts just to check the argument types for correctness, and I can exercise the success path as well as the failure path by setting the global g_nil_path. And of course, for more randomness, I use math.random when returning a number, and string.rep(string.char(math.random(65,90)), math.random(1,10)) when returning strings.

Friday, June 21, 2013

A simple documentation generator

No, I don't want to create yet another documentation generator like Doxygen; there are lots of them, each with its specific purpose and strengths. But mostly (at least AFAIK), all of them are intended to be embedded in the host language and to document the APIs of that same language. I.e., if one is writing some C code and wants to document the APIs of that C code, all is well! But I wanted something different (ah, as usual!).
I am embedding Lua into our application, and I am writing libraries, in C, to be used from Lua. And I want to document those Lua-facing APIs that I am building, by adding the usual doxygen-like comments in the source files.
I have two choices at this point: either I define some comment format/tags (like @something or \something-else), extract them from the source, then parse them and generate the docs; or ... I use something which is well suited for data description. Aha! I'm anyway learning/using Lua, which has its roots in data description - why not embed Lua code into my C comments?
Of course!
All I need is a brief description of what the function does, its arguments and its return types. At the minimum, I can capture that, in Lua, like:
   mydoc {
       name = "str_len", -- function name
       args =  { string = 's' }, -- input arg and its type
       ret = { len = 'i' }, -- return val and its type
       brief = "return the length of input string",
   }

And a way to quickly fetch it from the source: make the comment lines begin with '*!' (or something similar)
/*
 *!   mydoc {
 *!       name = "str_len", -- function name
 *!       args =  { string = 's' }, -- input arg and its type
 *!       ret = { len = 'i' }, -- return val and its type
 *!       brief = "return the length of input string",
 *!   }
 */
And how to convert this to docs? Simple! Just grep these comments out of the C source, and then define the mydoc function in Lua:
  function mydoc (t) -- excluding err handling and formatting
    if t.name then
        print (libname .. ":" .. t.name , "\n") -- libname also from source
    end
    if t.brief then
        print (" -- ", t.brief, "\n")
    end
    if t.args then
        print ("\tTakes " .. tablelength(t.args)  .. " arguments")
        describeargs(t.args) -- writes the values and their types
    else
        print "\tTakes no arguments\n"
    end
    if t.ret then
        print ("\tReturns " .. tablelength(t.ret)  .. " values")
        describeargs(t.ret)
    end
  end

Run all of this (the mydoc function and the extracted blocks) together in Lua, and there you go! Generating a plain-text doc was that easy - how difficult is it to generate Markdown? I added both as options (to the bash wrapper which extracts the comments and runs Lua on them) :-)

Sunday, January 23, 2011

Ruminations on 'ld', ELF and entry point to a C program

Q. Can a C program start at a function other than main()?

To answer this question in depth, one needs to know the C
run-time environment, i.e, at least the basic difference between a
  • Hosted environment (where all C standard libraries are available, program starts at main()). E.g. GNU/Linux, Windows.
  • Freestanding environment (no libraries available, how to start/load is up to the environment). E.g. Embedded systems.
(For this discussion, let's consider only GNU/Linux and the GCC compiler tools.)

Now, when we do a
$ gcc <file>.c

it goes through all the stages of compilation and linking to yield an executable (in ELF format). In that process, a default entry point is set up: the C run-time start-up code (crt0.o/crt1.o etc., which the GNU linker [ld] links in) runs first and then calls main().

If we want to establish a different entry point, we have to use the linker option -e (passed to ld). And if we had to mimic a totally freestanding implementation, we would need a lot of support functions just to try out this simple exercise; instead, let's use stdio from libc and change the entry point to start().

$ cat tmp.c
#include <stdio.h>
#include <stdlib.h>

int start()
{
    printf ("Hello World.\n");
    exit(0);
}

Note: we use exit() instead of return. We can't return, because we won't link with the C run-time start-up code that would normally handle the return from main(). Here we need stdio.h for printf() and stdlib.h for exit() (both provided by libc).

Lets compile it:
$ gcc -c tmp.c

We got tmp.o. To link this and get a.out, we need the path to the run-time dynamic linker (i.e., the path to ld-linux); on my RHEL box it happens to be:
/lib64/ld-linux-x86-64.so.2
(since I have a 64-bit AMD machine; if you want to find out yours, just run gcc -v on any program and look at the link-stage output)

Note: ld is the GNU linker; ld-linux is the dynamic linker (or loader), which the kernel loads first and which is responsible for loading the actual executable and all the required dynamic libraries.

Link:
$ ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 -e start tmp.o -lc

Here, -lc links against libc. Once this runs, we get a.out, and we're done. We can examine a.out using ldd:

$ ldd a.out
libc.so.6 => /lib64/tls/libc.so.6 (0x000000328b700000)
/lib64/ld-linux-x86-64.so.2 (0x000000328b300000)

If we don't pass the -dynamic-linker option to ld, it picks a default (/lib/ld64.so.1 for me).



At this point, we are done with our agenda of changing the entry point, but let's go a bit deeper into ld-linux, the dynamic linker (part of the OS) which actually loads the executable.
Why did we have to give the path to ld-linux in the link step? That's because the dynamic linker is a separate binary (perhaps for modularity reasons, it is not part of the kernel). When we give the dynamic-linker path to ld, it is recorded in a section called .interp in the ELF binary. When the ELF binary is run, the kernel's loader looks for the path to ld-linux in the .interp section, loads it first, and then hands the program over to ld-linux (we can see the ELF headers with the objdump or readelf utilities).
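To make the .interp idea concrete, here is a small sketch (not from the toolchain itself, just an illustration assuming a 64-bit ELF and doing only minimal error checking) that walks the program headers of an ELF file and prints the interpreter path stored in the PT_INTERP segment:

#include <stdio.h>
#include <stdlib.h>
#include <elf.h>

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s <elf-file>\n", argv[0]); return 1; }

    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) != 1) { fclose(f); return 1; }

    /* Walk the program headers looking for the PT_INTERP entry */
    for (int i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        fseek(f, (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize), SEEK_SET);
        if (fread(&ph, sizeof ph, 1, f) != 1) break;
        if (ph.p_type == PT_INTERP) {
            char *interp = malloc(ph.p_filesz + 1);
            fseek(f, (long)ph.p_offset, SEEK_SET);
            if (fread(interp, 1, ph.p_filesz, f) == ph.p_filesz) {
                interp[ph.p_filesz] = '\0';
                printf(".interp: %s\n", interp);
            }
            free(interp);
            break;
        }
    }
    fclose(f);
    return 0;
}

Running it on the a.out built above should print the same /lib64/ld-linux-x86-64.so.2 path that we passed to ld.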

This is similar to the shebang line in script executables. I.e., if the first line of a shell (bash) script is:
#!/bin/bash
and we call it hello.sh, chmod +x it, and execute it, the loader does something similar to:
/bin/bash hello.sh
i.e., it loads the shell and hands the script over to it. We can try the same thing with any dynamic executable (one that depends on shared libraries):

$ /lib64/ld-linux-x86-64.so.2 /bin/echo hello
hello

