Deciding to rewrite getaddrinfo in Rust

After reading about the newest glibc vulnerability, I decided to see how much effort it would take to rewrite parts of glibc in a safe language. Rust is well suited for this, as it should prevent the kind of buffer overflow that caused this problem. So where to start? The first order of business is to get a copy of the current implementation of getaddrinfo from glibc:

git clone git://sourceware.org/git/glibc.git

You will find the definition of getaddrinfo in

sysdeps/posix/getaddrinfo.c

It starts at line 2324 and runs for about 300 lines. All in all, not too bad.

Program flow

Setup

getaddrinfo starts by sanitizing the name and service inputs. It treats a NULL pointer and a string consisting of "*" as the same thing, so it replaces "*" with NULL. (Fun aside: name and service are both const char *, so I find it funny that they are reassigned inside the function. The const covers the characters rather than the pointer itself, and from the caller's perspective nothing changes, but it still strikes me as bad form.)
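In Rust the same normalization can be expressed with an Option instead of a nullable pointer. The following is a minimal sketch, not the glibc code; normalize_wildcard is a hypothetical helper name I made up for illustration.

```rust
/// Treat a missing value and the literal "*" the same way, mirroring what
/// getaddrinfo does for both `name` and `service`.
/// `normalize_wildcard` is a hypothetical name, not a glibc function.
pub fn normalize_wildcard(input: Option<&str>) -> Option<&str> {
    match input {
        // Either no value at all, or exactly the one-character string "*".
        None | Some("*") => None,
        // Anything else is passed through untouched.
        other => other,
    }
}
```

Because the function returns a new Option rather than overwriting its argument, the "reassigning a const parameter" oddity from the C version does not come up.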

Sanity checks

The next step is to check the flags against the list of allowed flags. First it checks whether any bit is set that shouldn't be, then it checks that the combination of flags makes sense, including:

  • if you pass in AI_CANONNAME make sure you passed in a name
  • if you want IPv6 make sure you have an IPv6 address
  • if you want IPv4 make sure you have an IPv4 address
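The checks above could look roughly like this in Rust. This is a sketch under several assumptions: the flag values are the ones I believe glibc's netdb.h uses on Linux (verify against your headers), the allowed set is only a subset of what glibc accepts, Family, GaiError and check_hints are hypothetical names, and I am assuming the address family check only applies when AI_ADDRCONFIG is set.

```rust
// Assumed flag values (Linux glibc <netdb.h>); check your own headers.
pub const AI_PASSIVE: i32 = 0x0001;
pub const AI_CANONNAME: i32 = 0x0002;
pub const AI_NUMERICHOST: i32 = 0x0004;
pub const AI_V4MAPPED: i32 = 0x0008;
pub const AI_ALL: i32 = 0x0010;
pub const AI_ADDRCONFIG: i32 = 0x0020;
pub const AI_NUMERICSERV: i32 = 0x0400;

/// Hypothetical stand-in for the ai_family hint.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Family {
    Unspec,
    Inet,
    Inet6,
}

/// Hypothetical error type loosely modelled on the EAI_* codes.
#[derive(Debug, PartialEq, Eq)]
pub enum GaiError {
    BadFlags,
    NoName,
}

/// Validate the hints before any lookup happens.
/// `have_ipv4` / `have_ipv6` say whether a local address of that family exists.
pub fn check_hints(
    flags: i32,
    family: Family,
    name: Option<&str>,
    have_ipv4: bool,
    have_ipv6: bool,
) -> Result<(), GaiError> {
    // Subset of the flags glibc allows; the real list is longer.
    const ALLOWED: i32 = AI_PASSIVE
        | AI_CANONNAME
        | AI_NUMERICHOST
        | AI_V4MAPPED
        | AI_ALL
        | AI_ADDRCONFIG
        | AI_NUMERICSERV;

    // 1. No bit outside the allowed set may be set.
    if (flags & !ALLOWED) != 0 {
        return Err(GaiError::BadFlags);
    }
    // 2. Asking for the canonical name requires a name.
    if (flags & AI_CANONNAME) != 0 && name.is_none() {
        return Err(GaiError::BadFlags);
    }
    // 3. Assumption: the family check is gated on AI_ADDRCONFIG.
    if (flags & AI_ADDRCONFIG) != 0 {
        match family {
            Family::Inet if !have_ipv4 => return Err(GaiError::NoName),
            Family::Inet6 if !have_ipv6 => return Err(GaiError::NoName),
            _ => {}
        }
    }
    Ok(())
}
```

Returning a Result keeps every early exit explicit, which is the part of the C code that is easiest to get wrong when adding new checks.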

Next we start dealing with the service. The function checks whether service is a number. If it's not, that's fine, unless AI_NUMERICSERV is set, in which case it has to error out. After this we really get into the heart of the function.
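The service check maps nicely onto an enum. Below is a minimal sketch; Service, ServiceError and classify_service are hypothetical names, the numeric_only parameter stands in for the AI_NUMERICSERV flag, and treating a number that does not fit in a u16 as a name is a simplification of mine rather than glibc behaviour.

```rust
/// What the service string turned out to be.
#[derive(Debug, PartialEq, Eq)]
pub enum Service {
    /// The string was a decimal port number.
    Port(u16),
    /// The string is a service name that still needs a lookup.
    Name(String),
}

#[derive(Debug, PartialEq, Eq)]
pub enum ServiceError {
    /// A name was given although only numeric services are allowed.
    NotNumeric,
}

/// Decide whether `service` is a port number or a name.
/// `numeric_only` plays the role of AI_NUMERICSERV.
pub fn classify_service(service: &str, numeric_only: bool) -> Result<Service, ServiceError> {
    match service.parse::<u16>() {
        Ok(port) => Ok(Service::Port(port)),
        // Not a number and the caller insisted on one: error out.
        Err(_) if numeric_only => Err(ServiceError::NotNumeric),
        // Not a number, but names are acceptable: keep it for a later lookup.
        Err(_) => Ok(Service::Name(service.to_string())),
    }
}
```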

Resolving Interfaces

Once the sanity checks have passed, control moves into gaih_inet, the function that powers getaddrinfo. It also happens to have a ton of gotos and very unhelpfully named variables. After formatting some data structures, it calls __getservbyname_r (inside a nested helper function, gaih_inet_serv) and __gethostbyname2_r with those structures and then processes the results. Under some circumstances it also engages NSS to do the lookup using the NSS versions of gethostbyname. Then it looks like the function tries to connect to the services on the given port (discovered from __getservbyname_r) and returns a list of connections.

Formatting and Sorting

After the actual data is gathered, the next step is sorting it. There appear to be a few places where data is sorted. First, a list of local IPv6 interfaces is sorted; this is used later to determine whether we connected via a "temporary or disabled" interface. The next sort orders the results according to RFC 3484. The final step is to set the results, which is done with a chained double assignment:

q = p = results[order[0]].dest_addr;
q = q->ai_next = results[order[i]].dest_addr;

ಠ_ಠ
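For comparison, here is how the "sort an index array, then link the results in that order" step might look in Rust. It is only a sketch: AddrNode, link_in_order and collect are hypothetical names, the address is a plain String stand-in, and the rank closure is a placeholder for the RFC 3484 comparison rules, which are not implemented here.

```rust
/// Hypothetical stand-in for one addrinfo entry in the result list.
#[derive(Debug)]
pub struct AddrNode {
    pub addr: String,
    pub next: Option<Box<AddrNode>>,
}

/// Sort indices into `addrs` by `rank` (lower first), then link the
/// entries into a singly linked list in that order.
pub fn link_in_order(addrs: Vec<String>, rank: impl Fn(&str) -> u32) -> Option<Box<AddrNode>> {
    // Like glibc's `order` array: sort indices, not the entries themselves.
    let mut order: Vec<usize> = (0..addrs.len()).collect();
    order.sort_by_key(|&i| rank(addrs[i].as_str())); // stable sort

    // Wrap entries in Option so each one can be moved out exactly once.
    let mut slots: Vec<Option<String>> = addrs.into_iter().map(Some).collect();

    // Build the list back to front so every node can own its successor.
    let mut head = None;
    for &i in order.iter().rev() {
        let addr = slots[i].take().expect("each index appears exactly once");
        head = Some(Box::new(AddrNode { addr, next: head }));
    }
    head
}

/// Walk the list and collect the addresses, mainly for inspection.
pub fn collect(head: &Option<Box<AddrNode>>) -> Vec<&str> {
    let mut out = Vec::new();
    let mut cur = head.as_deref();
    while let Some(node) = cur {
        out.push(node.addr.as_str());
        cur = node.next.as_deref();
    }
    out
}
```

Building the list from the back avoids the trailing-pointer trick that the chained assignment in the C code relies on.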

Implementing in Rust

To reimplement this, I think I will break it down into a few separate components (much like glibc does). Look for follow-up posts on each of these components.

  • Implement a get-all-local-interfaces function (__check_pf, i.e. a getifaddrs wrapper)
  • Implement a getservbyname_r function
  • Implement a gethostbyname_r function
  • Implement a connect to host on given port function
  • Implement a demarshaler / marshaler into the C data structure
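To give a feel for the size of one of these pieces, here is a rough sketch of a service lookup over /etc/services-style text. It assumes file-style data of the form "name port/protocol aliases # comment" and ignores NSS entirely, so it is not a drop-in for getservbyname_r; ServiceEntry and lookup_service are hypothetical names.

```rust
/// Hypothetical result type for a service lookup.
#[derive(Debug, PartialEq, Eq)]
pub struct ServiceEntry {
    pub name: String,
    pub port: u16,
    pub protocol: String,
}

/// Find `wanted` (official name or alias) for `protocol` in text that
/// follows the /etc/services layout. Malformed lines are skipped.
pub fn lookup_service(services: &str, wanted: &str, protocol: &str) -> Option<ServiceEntry> {
    for line in services.lines() {
        // Everything after '#' is a comment.
        let line = line.split('#').next().unwrap_or("");
        let mut fields = line.split_whitespace();

        // The first two fields are the official name and "port/protocol".
        let (name, port_proto) = match (fields.next(), fields.next()) {
            (Some(n), Some(p)) => (n, p),
            _ => continue,
        };
        let (port, proto) = match port_proto.split_once('/') {
            Some(pair) => pair,
            None => continue,
        };
        let port: u16 = match port.parse() {
            Ok(p) => p,
            Err(_) => continue,
        };

        // Remaining fields are aliases.
        let name_matches = name == wanted || fields.any(|alias| alias == wanted);
        if proto == protocol && name_matches {
            return Some(ServiceEntry {
                name: name.to_string(),
                port,
                protocol: proto.to_string(),
            });
        }
    }
    None
}
```

Taking the file contents as a &str keeps the parsing testable without touching the filesystem; reading /etc/services itself would be a thin wrapper around this.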

At the end I want to benchmark against the libc version to see what the slowdown is. I also want to classify which parts of glibc my version still uses, to see what else needs to be implemented to truly have a replacement for the glibc function.

Stay Tuned.
