Code Rants

Tuesday, February 12, 2019

REPL for C!?

I was telling a colleague to try code snippets in the REPL first when working in languages like Python or Lua, and a thought crossed my mind: "why not a simple REPL for C?" That's how I came to hack together a quick shell (bash) script which does just that: a REPL for C!
Of course, C is not interpreted or dynamic, and you cannot do many things that you could do in the REPL of a dynamic language. But, this does save some keystrokes when you want to quickly run and check some code snippet.

The script:
https://gist.github.com/aniruddha-a/2877defd751b20a8f4dbbe04b7b6fe43

The readline helper, with history and completion for often used std functions/types:
https://gist.github.com/aniruddha-a/dfca2c5f88fdc1df5591943422fe0349
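For readers who just want the gist of the idea, here is a minimal sketch of how such a loop could work. It is not the script linked above: the function names (wrap_snippet, run_snippet, c_repl), the choice of headers and the use of $CC are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the script from the gist, just the general idea.

# Wrap a C snippet in a minimal program (the header list is an assumption).
wrap_snippet() {
    printf '#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n'
    printf 'int main(void)\n{\n%s\n    return 0;\n}\n' "$1"
}

# Compile the wrapped snippet with the system C compiler and run it.
run_snippet() {
    local dir rc=0
    dir=$(mktemp -d) || return 1
    wrap_snippet "$1" > "$dir/snippet.c"
    { "${CC:-cc}" -o "$dir/snippet" "$dir/snippet.c" && "$dir/snippet"; } || rc=$?
    rm -rf "$dir"
    return "$rc"
}

# Read-eval-print loop: one line of C per prompt, Ctrl-D to quit.
c_repl() {
    local line
    while IFS= read -r -p 'c> ' line; do
        run_snippet "$line"
    done
}
```

Calling c_repl at the end of the file starts the loop; typing something like printf("%d\n", 1 << 10); at the prompt compiles and runs that single statement. Every line is a fresh program, so nothing carries over between prompts.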

Wednesday, September 20, 2017

How does Checkinstall work

Today, while trying to build and install some software from source, I discovered checkinstall.
What it does: it works out what make install does, and then creates a Debian (or RPM etc.) package. At first it might sound like "Ah! not much", but think again ...
A Makefile can have any kind of recipe, invocations of other shell commands, loops ...
how the #&$% can it know!?

I could not rest until I understood how this cool thing works! As usual, I first tried Googling for articles on it, but didn't find any. OK, no problem! I downloaded the sources and grokked my way through them.

Here is a brief explanation:

- checkinstall relies on a utility called installwatch
- installwatch intercepts many libc calls such as open(), chown() and unlink(), and logs the operations in a parseable form
- This log is used by a shell wrapper, which translates it into commands/formats for the different package managers
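To make the last step concrete, here is a rough sketch of turning a log of file operations into the list of files that ended up installed. The log format used here (call name, a tab, then the path) is invented for the example and is not installwatch's real format; installed_files is a made-up name too.

```shell
#!/usr/bin/env bash
# Hypothetical log format (NOT installwatch's real one): "<call><TAB><path>".

# Print the paths that were created or touched and not removed again.
installed_files() {
    awk -F '\t' '
        $1 == "open" || $1 == "chown" { seen[$2] = 1 }     # file created or touched
        $1 == "unlink"                { delete seen[$2] }  # file removed again
        END { for (p in seen) print p }
    ' "$1" | sort
}
```

A packaging wrapper could then hand such a list to whatever builds the .deb or .rpm.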

Thursday, August 24, 2017

Enable tracing on bash scripts

With shell scripts it is at times difficult to work out which command took too long, or which step had an issue. Even though I add a generous amount of debug logging, there are times when -x (execution trace) is invaluable. But I don't want to litter my regular logs with the -x output either!
So, after a bit of searching, I found this nice snippet of bash:

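trace_file() {
    exec 4>>"$1"        # open the trace file for appending on fd 4
    BASH_XTRACEFD=4     # send the -x output to fd 4 instead of stderr (bash 4.1+)
    set -x
}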

With this function defined, I can invoke it in any script, with the path to the trace file:

trace_file /tmp/mytrace
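To help with the "which command took too long" question, the trace lines can also carry a timestamp and location. A small sketch that extends the function with a PS4 prefix; the prefix is my addition, not part of the snippet I found, and the mktemp file only stands in for /tmp/mytrace.

```shell
#!/usr/bin/env bash
# Same function as above, plus a PS4 prefix (an addition, not from the original snippet).
trace_file() {
    exec 4>>"$1"
    BASH_XTRACEFD=4
    # Seconds since the shell started, then script name and line number.
    PS4='+ ${SECONDS}s ${BASH_SOURCE##*/}:${LINENO}: '
    set -x
}

trace_log=$(mktemp)          # stand-in for /tmp/mytrace
trace_file "$trace_log"
echo "regular output stays clean"
set +x                       # stop tracing; fd 4 stays open until the script exits
```

PS4 is single-quoted on purpose, so that the variables are expanded each time a trace line is printed rather than once when the function runs.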

Monday, October 10, 2016

How (not) to write a parser

One of my favorite subjects in college was parsing and compiler construction (remember the Dragon Book?). Ever since I first tried lex/yacc (flex/bison), it has been a mysterious piece of software that I wanted to master!

Fast forward 10 years, and a lot has changed in the field: LR(1) is passé, GLR and LL(*) are in vogue (thanks to tools like ANTLR)

When I thought about venturing into writing a parser for the YANG modelling language (RFC 6020), I first did some study+survey, and I formed some new opinions about parsing:

- Don't roll out a hand-written RDP (recursive descent parser) in desperation

- No separate tokenization phase (scanner-less)

- Not LR(1) - too limited

- Not ANTLR! (though it's a fantastic tool, it leans a lot towards Java!)

- Not Bison GLR! (no proper documentation!)

- Use PEG (yet another learning opportunity :-))

- Not C! (more about it later)

PEG is cool, and with my love for regex, PEG attracted me way too much :-) I tried many PEG-based parser generators.

To start with, I picked up Waxeye. Though it has a choice of many targets, the C port is not optimal (the author himself said that more work is needed).

Then I tried peg/leg. Though it's very similar to lex/yacc in usage, I found it hard to debug/modify the grammar. Part of the problem could also have been the YANG ABNF grammar (as defined in the RFC): there are no reserved words in YANG. I spent much more time on peg/leg than on Waxeye, but eventually gave up once I realized that I was getting nowhere! So, C is gone, as I had only 2 options that had a C target!

Lua/LPeg: Though I'm familiar with Lua, this was my first tryst with LPeg. I had heard a lot of praise for Roberto's work, and I got to know why quite soon. Within a few hours I was able to create a matcher (deliberately avoiding the word parser here) that could recognize 80% of the YANG files at my disposal! (I had to tweak the matcher for the remaining 20%.) This time I chose a different approach: instead of using the YANG ABNF grammar, I created some simple, relaxed rules.

Here is a brief description of the approach taken.

YANG statements can be either of the form:

1. identifier [value];

Or:

2. identifier [value] {

One or more of 1. (depending on identifier)

}

Note that value is optional. Since there are no reserved words, this leads to a simple matcher (~30 lines!). Even though there are no reserved words, we do have keywords, and based on those we should be able to say whether the nesting is right or wrong; I call this the validation phase. And that, too, can be simplified to triviality if we keep the AST as a simple Lua array (table).

For example, a leaf can be under a container, but not vice versa:

Valid:

container C1 {
    leaf l1 {
        type string;
    }
}

Invalid:

leaf l1 {
    container C1 {
    }
}

I keep a list of allowed-children for each type of node. Additional checks are performed with custom functions per node-type. And, since these children are arranged as arrays – we can also ensure order.
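The allowed-children idea fits in a few lines. lyang itself is written in Lua; the shell version below is only an illustration, the function names are made up, and the child lists are a small hand-picked subset rather than the full set from RFC 6020.

```shell
#!/usr/bin/env bash
# Illustration only: a tiny, incomplete allowed-children table for YANG nodes.

# Print the statements that may appear directly under the given node type.
allowed_children() {
    case "$1" in
        container) echo "container leaf leaf-list list description" ;;
        list)      echo "key container leaf leaf-list list description" ;;
        leaf)      echo "type units default mandatory description" ;;
        *)         echo "" ;;
    esac
}

# Succeed if a node of type $2 may be nested directly under a node of type $1.
can_nest() {
    local child
    for child in $(allowed_children "$1"); do
        [ "$child" = "$2" ] && return 0
    done
    return 1
}
```

With this table, can_nest container leaf succeeds while can_nest leaf container fails, which matches the valid and invalid examples above.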

That’s it! :-) - take a look at: https://github.com/aniruddha-a/lyang

Comments/feedback welcome.

Monday, December 14, 2015

Oh! But how would I know without 'running' it, if it's OK?

Have you come across teammates checking in scripts with syntax errors, malformed XML documents, or broken Makefiles, without ever running them? Well, there is some hope, and you could help (by sharing this link! :-) )

Mostly, if it's code in a compiled language (like C), developers at least compile the code (they might not test it!) before checking it into version control. But in the case of XML documents, scripts etc., the code gets checked in, and it will not error out until some test case exercises it, or until the nightly regressions!

It's not really difficult to do some basic validation of scripts, Makefiles, or even XML documents before a check-in. Most languages/tools have an option to do basic checks: syntax checks, a dry run (without actually executing the commands), bytecode generation etc. Here are some which I often use:

Make

make -n

Does a dry run: prints the commands that would be executed on an actual run, without executing them.

Perl

perl -c

Compiles the script and says whether the syntax is OK or not (there are caveats with BEGIN blocks, which still get executed, but at least there is something!)

Python

pycompile

This is good, there is a separate utility to do the compilation! (just run pycompile on your scripts once before you proceed; where pycompile is not installed, python -m py_compile does the same job)

bash

bash -n

Again, syntax checks without actually executing anything.

XML

xmllint

The name says it all! Even though there are plenty of options, you could just run it once (even without options) on your XML docs to see if they are well-formed. Add --noout if you don't want the document echoed back.

Lua

luac

Lua bytecode compiler, like pycompile. Generates luac.out, which you can delete :-) (or use luac -p, which only parses and writes no output file)
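These checks are easy to bundle into one helper that picks the right tool from the file name. A sketch, assuming the tools are on the PATH; the function name check_file is made up, and python3 -m py_compile is used as a portable stand-in for pycompile.

```shell
#!/usr/bin/env bash
# Sketch: run the cheapest available syntax check for one file, chosen by its name.
check_file() {
    case "$1" in
        *Makefile) make -n -C "$(dirname "$1")" -f "$(basename "$1")" >/dev/null ;;  # dry run
        *.pl)      perl -c "$1" ;;
        *.py)      python3 -m py_compile "$1" ;;   # stand-in for pycompile
        *.sh)      bash -n "$1" ;;
        *.xml)     xmllint --noout "$1" ;;
        *.lua)     luac -p "$1" ;;                 # parse only, no luac.out
        *)         echo "no checker for $1" >&2 ;;
    esac
}

# Check every file named on the command line and remember whether any check failed.
status=0
for f in "$@"; do
    check_file "$f" || status=1
done
```

Saved as a script with exit "$status" as its last line, this could be called from a pre-commit hook with the list of changed files.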

Monday, December 07, 2015

Recover a Locked Android without data loss


Since I had to recover a Samsung Note 3, all the steps mostly lean towards Samsung phones.

Scenario

- Password forgotten! (you may think that this cannot happen to you, but it surely can! :-) )
- Android Device Manager (ADM) password reset doesn't work (and it didn't, BTW! at least I could give it a try as WiFi was on)
- Phone not registered with Samsung Recovery (good that they have an alternative to ADM! but unlike ADM, you need to explicitly register your device, and the one I was handling wasn't registered)
- Un-rooted phone, with stock Android recovery
- USB debugging disabled!

For the impatient

TL;DR summary

- Install Samsung CDC drivers
- Get the right device code (this one is ha3g)
- Get the right custom recovery image (TWRP or CWM)
- Use a firmware flashing utility which works for you (Odin or Heimdall)
- Flash the recovery image.
- Boot into recovery mode, and delete the password files.

That's it!

Details

But,... the devil is in the details!

USB drivers

Get the Samsung CDC drivers and make sure your OS detects the phone. (If you are on Windows 10 and want to connect with ADB: Samsung doesn't have ADB drivers for Windows 10!)

Device Code Name

Now, this is one confusing thing! All the custom ROMs name their images by the phone's code name.
The Samsung Note 3, for example, has multiple versions: Sprint, Verizon and International. At first glance it might seem like "Ah! I bought it in India, and it's not tied to any carrier, so it must be International" ... sorry, not that simple! You need to know the correct CPU and model. In this case the code name is ha3g and the CPU is an Exynos, though that's not mentioned anywhere on the box or in the manuals; the model is N900, which is the easy part, as it shows up when the phone starts up.

Firmware Flash utility

2 choices here:

- Heimdall (FOSS, available on GitHub. Binary packages for both Windows and Linux are available)
- Odin (Leaked [from Samsung] Windows application)

Tryst with Heimdall

Since I have a love for FOSS, I was hell-bent on getting Heimdall to work, and it had a cool command line! I downloaded and built the latest version from source (on Ubuntu 14.04). But it didn't work, and I could not figure out what the problem was! I tried the Windows binaries too, on both USB 2.0 and 3.0 ports (on the 3.0 port it wouldn't even detect the phone!)
(The good part of trying Heimdall: I got to know a little bit about partitions and the PIT [Partition Information Table])

A note on USB versions: if you do not know how to recognize the ports, peep into the USB socket. A BLUE insert usually means 3.0; 2.0 ports are usually black or white (a YELLOW one is typically an always-on charging port).

Odin, finally

Odin is supposedly very picky: about the port, the cable etc. From what I read, if the phone is not detected on one port, try another port and a different cable (in my case I had the original USB 3.0 cable that came with the phone, and it got recognized instantly)
Odin's messages and UI aren't too friendly either!

Custom recovery image

Here, again there are quite a few choices:

- ClockWorkMod (CWM): development ceased
- Team Win Recovery Project (TWRP): new and cool
- CyanogenMod Recovery (CR)

I could not find a CWM image for ha3g (though I could find one for hltexx, which is not Exynos!). CR is still new! Initially, I tried whichever version of the TWRP image I could get, but Odin wouldn't flash it! It would give this error message:

NAND Write Start!! 
FAIL!

At first I thought it had something to do with the NAND flash storage! But after a lot more research I found that it could be due to the type of image being written. I had to do 2 things to be able to flash successfully:

- Get the latest TWRP (2.8.6.0 recovery image, bundled as a .tar)
- Extract the .img from the .tar and convert it to a .tar.md5 (I found a script on XDA forums which did this)

And, finally, Odin could flash the image to the phone!
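For reference, the .tar.md5 that Odin accepts is, as far as I know, just a ustar archive with its own md5sum line appended. A minimal sketch of that conversion, assuming GNU tar and md5sum and an image sitting in the current directory; make_tar_md5 is a made-up name and this is not the XDA script itself.

```shell
#!/usr/bin/env bash
# Sketch of the commonly shared recipe, NOT the XDA script mentioned above.

# Pack <name>.img into <name>.tar.md5 in the current directory.
make_tar_md5() {
    local img=$1                       # e.g. recovery.img
    local tarball=${img%.img}.tar
    tar --format=ustar -c -f "$tarball" "$img"
    md5sum -t "$tarball" >> "$tarball" # append the checksum line to the archive itself
    mv "$tarball" "$tarball.md5"
}
```

Running make_tar_md5 recovery.img leaves recovery.tar.md5 next to the image, ready to be picked in Odin.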

To write to flash, you need to put the phone in Download Mode; on Samsung phones that is done by holding down the Volume-Down, Home and Power keys together.

Recovery mode

First, you need to know how to get to recovery mode: press the Volume-Up + Home + Power keys and hold till the logo flashes (do not confuse this with Download Mode!)
The catch here is: though we flashed the TWRP recovery, the phone tries to be smart and replaces it with the stock recovery if you let it reboot by itself! The remedy is to boot into TWRP immediately after the flash, i.e., once flashed, do not let it restart normally (un-check Auto-Reboot in Odin)

Ah, TWRP!

Now, this is cool! If you have ever seen the default Android recovery, comparing it with TWRP is like comparing age-old feature phones to modern-day touch phones! TWRP has a touch interface and neat buttons, and you can pretty much do without reading any manuals, thanks to the neat and simplified UI.

Play safe

The first thing I did after I could get to TWRP was to insert a microSD card and take a backup of the data, so that I could continue my R&D (it takes a NANDroid backup, which it can restore).

Recover/Remove?

I tried to pull out the locksettings.db and run some SQLite queries to get the MD5-hashed password and salt. At this stage, I didn't want to go any further. So, I came back to TWRP and deleted the 2 key files /data/system/password.key and /data/system/gesture.key (though I knew it was password-locked and not gesture-locked). On reboot: no password! :) Nothing lost, all contacts and data intact.

Thanks

- To this post for motivation! (so many Samsung service centers told me that it's not possible, and that factory-reset [with data loss] is the only option!)
- To this guy for the script
- To the many other detailed posts, step-wise procedures and YouTube videos which Android lovers have patiently put together.
- And, of course, to TWRP! I donated $15 towards the development (~1K/-)

Monday, November 30, 2015

Common config that can be utilized by multiple languages

Often, I want to separate some config/initial data out of a program, and keep it in a way that is accessible to different parts of the system. If all parts of the system are in the same language, then the config can be very simple: just function invocations with values.

For example, in shell:

# Format:
# Employee name role department
Employee "Bheem" "Fighter" "Security"
Employee "Chutki" "Hiring" "HR"

Then, the invoker would just have to define the function Employee and source this file!
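A quick sketch of such an invoker. What Employee does with each record is up to the consumer; here it just prints a line, and a temp file stands in for the real config file.

```shell
#!/usr/bin/env bash
# The consumer decides what an Employee record means ...
Employee() {
    printf '%s works in %s as %s\n' "$1" "$3" "$2"
}

# ... and then simply sources the config file (a temp file stands in for it here).
config=$(mktemp)
cat > "$config" <<'EOF'
# Format:
# Employee name role department
Employee "Bheem" "Fighter" "Security"
Employee "Chutki" "Hiring" "HR"
EOF

report=$(. "$config")
printf '%s\n' "$report"
```

Another script could define Employee differently (say, to append to a CSV) and source the very same file.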

So far, so good, if all parts are in shell. But what if I want to use this config in some C code, in Lua scripts etc.?

(Don't think file-read/parse/split/tokenize .. No no! yuck!)

Can we define (somehow), a common format which is usable by different environments (languages), with minimal or no effort!? Observe the calling conventions:

C function call looks like:

Employee (arg1, arg2 /* ... */)

Shell (is the simplest):

Employee "arg1" "arg2" # ...

Lua:

Employee (arg1, arg2) -- ...

But there is a special notation in Lua: when the only argument is a string literal or a table constructor, we can get rid of the parentheses! Like:

Employee "arg1"

OK, but what if we have more than 1 argument? Now we need to get a bit deeper, into function chaining (currying):

function Employee (name)
    return function (role)
        return function (department)
            add_to_db(name, role, department)
        end
    end
end

Without going into much detail about anonymous functions, lexical scoping, upvalues etc.: the function returns a function, which returns yet another function. That means, when you invoke

Employee "Bheem" "Fighter" "Security"

The first function reads the first value, and returns its inner function, which reads the second value and that in turn returns another function which finally does some work (and can access all the 3 values - this is what we want)! Cool, ain't it? :-)

Wait, we solved the problem for shell and Lua, but what about C code?

In the case of C code, we can always use the well-documented Lua C API to read the Lua config! :)
(after all, Lua evolved from a data-description language)

If you think all this is too much work, think again: remember the DRY principle.