September 03, 2019
August 31, 2019
Unit testing C code with gtest
This post covers building and testing a minimal, but still useful, C project. We'll use Google's gtest and CMake for testing C code. This will serve as a foundation for some upcoming posts/projects on programming Linux, userland networking and interpreters.
The first version of this post only included one module to
test. The test/CMakeLists.txt would also only expose a
single pass-fail status for all modules. The second version of this
post extends the test/CMakeLists.txt to expose
each test/*.cpp file as its own CMake test so that
results are displayed by ctest per file. The second
version also splits the original src/testy.c
and include/testy/testy.h module into
a widget and customer module to
demonstrate the changes to the CMake configuration.
The "testy" sample project
In this project, we'll put source code in src/ and publicly
exported symbols (functions, structs, etc.) in header files in
include/testy/. There will be a main.c in the src/
directory. Tests are written in C++ (since gtest is a C++ testing
framework) and are in the test/ directory.
Here's an overview of the source and test code.
src/widget.c
This file has some library code that we should be able to test.
#include "testy/widget.h"
int private_ok_value = 2;
int widget_ok(int a, int b) {
return a + b == private_ok_value;
}
include/testy/widget.h
This file handles exported symbols for widget code.
#ifndef _WIDGET_H_
#define _WIDGET_H_
int widget_ok(int, int);
#endif
src/customer.c
This file has some more library code that we should be able to test.
#include "testy/customer.h"
int customer_check(int a) {
return a == 5;
}
include/testy/customer.h
This file handles exported symbols for customer code.
#ifndef _CUSTOMER_H_
#define _CUSTOMER_H_
int customer_check(int);
#endif
src/main.c
This is the entrypoint to a program built around libtesty.
#include "testy/customer.h"
#include "testy/widget.h"
int main() {
if (widget_ok(1, 1)) {
return customer_check(5);
}
return 0;
}
test/widget.cpp
This is one of our test files. It registers test cases and uses gtest
to make assertions. We need to wrap the testy/widget.h include in an
extern "C" to stop C++ from
name-mangling.
#include "gtest/gtest.h"
extern "C" {
#include "testy/widget.h"
}
TEST(widget, ok) {
ASSERT_EQ(widget_ok(1, 1), 1);
}
TEST(testy, not_ok) {
ASSERT_EQ(widget_ok(1, 2), 0);
}
You can see a good high-level overview of gtest testing utilities like
ASSERT_EQ and TEST
here.
test/customer.cpp
This is another one of our test files.
#include "gtest/gtest.h"
extern "C" {
#include "testy/customer.h"
}
TEST(customer, ok) {
ASSERT_EQ(customer_check(5), 1);
}
TEST(testy, not_ok) {
ASSERT_EQ(customer_check(0), 0);
}
test/main.cpp
This is a standard entrypoint for the test runner.
#include "gtest/gtest.h"
int main(int argc, char **argv) {
::testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}
Building with CMake
CMake is a build tool that (among other things) produces a Makefile we can run to build our code. We will also use it for dependency management. But fundementally we use it because gtest requires it.
CMake options/rules are defined in a CMakeLists.txt file. We'll have one in the root directory, one in the test directory, and a template for one that will handle the gtest dependency.
A first draft of the top-level CMakeLists.txt might look like this:
cmake_minimum_required(VERSION 3.1)
project(testy)
##
### Source definitions ###
##
include_directories("${PROJECT_SOURCE_DIR}/include")
file(GLOB sources "${PROJECT_SOURCE_DIR}/src/*.c")
add_executable(testy ${sources})
Using include_directory will make sure we compile with the -I flag
set up correctly for our include directory.
Using add_executable sets up the binary name to produce from the
given sources. And we're using the file helper to get a glob match
of C files rather than listing them all out verbatim in the
add_executable call.
Building and running
CMake pollutes the current directory, and is fine running in a
different directory, so we'll make a build/ directory so we don't
pollute root. Then we'll build a Makefile with CMake, run Make, and
run our program.
$ mkdir build
$ cd build
$ cmake ..
-- The C compiler identification is AppleClang 10.0.1.10010046
-- The CXX compiler identification is AppleClang 10.0.1.10010046
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/philipeaton/tmp/testy/build
$ make
[ 25%] Building C object CMakeFiles/testy.dir/src/customer.c.o
[ 50%] Building C object CMakeFiles/testy.dir/src/widget.c.o
[ 75%] Building C object CMakeFiles/testy.dir/src/main.c.o
[100%] Linking C executable testy
[100%] Built target testy
$ ./testy
$ echo $?
1
CMakeLists.txt.in
This template file handles downloading the gtest dependency from
github.com pinned to a release. It will be copied into a subdirectory
during the cmake .. step.
cmake_minimum_required(VERSION 3.1)
project(googletest-download NONE)
include(ExternalProject)
ExternalProject_Add(googletest
GIT_REPOSITORY https://github.com/google/googletest.git
GIT_TAG release-1.8.1
SOURCE_DIR "${CMAKE_BINARY_DIR}/googletest-src"
BINARY_DIR "${CMAKE_BINARY_DIR}/googletest-build"
CONFIGURE_COMMAND ""
BUILD_COMMAND ""
INSTALL_COMMAND ""
TEST_COMMAND ""
)
Now we can tell CMake about it and how to build, within the top-level CMakeLists.txt file.
cmake_minimum_required(VERSION 3.1)
project(testy)
##
### Test definitions ###
##
configure_file(CMakeLists.txt.in
googletest-download/CMakeLists.txt)
execute_process(COMMAND ${CMAKE_COMMAND} -G "${CMAKE_GENERATOR}" .
WORKING_DIRECTORY ${CMAKE_BINARY_DIR}/googletest-download )
execute_process(COMMAND ${CMAKE_COMMAND} --build .
WORKING_DIRECTORY ${CMAKE_BINARY_DIR}/googletest-download )
add_subdirectory(${CMAKE_BINARY_DIR}/googletest-src
${CMAKE_BINARY_DIR}/googletest-build)
enable_testing()
add_subdirectory(test)
##
### Source definitions ###
##
include_directories("${PROJECT_SOURCE_DIR}/include")
file(GLOB sources
"${PROJECT_SOURCE_DIR}/include/testy/*.h"
"${PROJECT_SOURCE_DIR}/src/*.c")
add_executable(testy ${sources})
The add_subdirectory calls register a directory that contains its
own CMakeLists.txt. It would fail now without a CMakeLists.txt file
in the test/ directory.
test/CMakeLists.txt
This final file registers a unit_test executable compiling against
the source and test code, and includes the project header files.
include_directories("${PROJECT_SOURCE_DIR}/include")
file(GLOB sources "${PROJECT_SOURCE_DIR}/src/*.c")
list(REMOVE_ITEM sources "${PROJECT_SOURCE_DIR}/src/main.c")
file(GLOB tests "${PROJECT_SOURCE_DIR}/test/*.cpp")
list(REMOVE_ITEM tests "${PROJECT_SOURCE_DIR}/test/main.cpp")
foreach(file ${tests})
set(name)
get_filename_component(name ${file} NAME_WE)
add_executable("${name}_tests"
${sources}
${file}
"${PROJECT_SOURCE_DIR}/test/main.cpp")
target_link_libraries("${name}_tests" gtest_main)
add_test(NAME ${name} COMMAND "${name}_tests")
endforeach()
We have to register a test for each file otherwise each file's tests
won't show up by default (i.e. without a --verbose flag).
Building and running tests
Similar to building and running the source, we run CMake in a
subdirectory but run make test or ctest after building all sources
and tests with make.
$ cd build
$ cmake ..
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/philipeaton/tmp/testy/build/googletest-download
Scanning dependencies of target googletest
[ 11%] Creating directories for 'googletest'
[ 22%] Performing download step (git clone) for 'googletest'
Cloning into 'googletest-src'...
Note: checking out 'release-1.8.1'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
HEAD is now at 2fe3bd99 Merge pull request #1433 from dsacre/fix-clang-warnings
[ 33%] No patch step for 'googletest'
[ 44%] Performing update step for 'googletest'
[ 55%] No configure step for 'googletest'
[ 66%] No build step for 'googletest'
[ 77%] No install step for 'googletest'
[ 88%] No test step for 'googletest'
[100%] Completed 'googletest'
[100%] Built target googletest
-- Found PythonInterp: /usr/local/bin/python (found version "2.7.16")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/philipeaton/tmp/testy/build
$ make
[ 4%] Building C object CMakeFiles/testy.dir/src/customer.c.o
[ 9%] Building C object CMakeFiles/testy.dir/src/widget.c.o
[ 13%] Building C object CMakeFiles/testy.dir/src/main.c.o
[ 18%] Linking C executable testy
[ 18%] Built target testy
[ 22%] Building CXX object googletest-build/googlemock/gtest/CMakeFiles/gtest.dir/src/gtest-all.cc.o
[ 27%] Linking CXX static library libgtest.a
[ 27%] Built target gtest
[ 31%] Building CXX object googletest-build/googlemock/CMakeFiles/gmock.dir/src/gmock-all.cc.o
[ 36%] Linking CXX static library libgmock.a
[ 36%] Built target gmock
[ 40%] Building CXX object googletest-build/googlemock/CMakeFiles/gmock_main.dir/src/gmock_main.cc.o
[ 45%] Linking CXX static library libgmock_main.a
[ 45%] Built target gmock_main
[ 50%] Building CXX object googletest-build/googlemock/gtest/CMakeFiles/gtest_main.dir/src/gtest_main.cc.o
[ 54%] Linking CXX static library libgtest_main.a
[ 54%] Built target gtest_main
[ 59%] Building C object test/CMakeFiles/customer_tests.dir/__/src/customer.c.o
[ 63%] Building C object test/CMakeFiles/customer_tests.dir/__/src/widget.c.o
[ 68%] Building CXX object test/CMakeFiles/customer_tests.dir/customer.cpp.o
[ 72%] Building CXX object test/CMakeFiles/customer_tests.dir/main.cpp.o
[ 77%] Linking CXX executable customer_tests
[ 77%] Built target customer_tests
Scanning dependencies of target widget_tests
[ 81%] Building C object test/CMakeFiles/widget_tests.dir/__/src/customer.c.o
[ 86%] Building C object test/CMakeFiles/widget_tests.dir/__/src/widget.c.o
[ 90%] Building CXX object test/CMakeFiles/widget_tests.dir/widget.cpp.o
[ 95%] Building CXX object test/CMakeFiles/widget_tests.dir/main.cpp.o
[100%] Linking CXX executable widget_tests
[100%] Built target widget_tests
After running cmake and make, we're finally ready to run ctest.
$ ctest
Test project /Users/philipeaton/tmp/testy/build
Start 1: customer
1/2 Test #1: customer .......................... Passed 0.01 sec
Start 2: widget
2/2 Test #2: widget ............................ Passed 0.00 sec
100% tests passed, 0 tests failed out of 2
Total Test time (real) = 0.01 sec
Now we're in a good place with most of the challenges of unit testing C code (i.e. ignoring mocks) past us.
In preparation for a couple new articles on some C projects, here's a foundational post on building C code and writing/running unit tests with gtest and cmake https://t.co/aMVyr7LO73
— Phil Eaton (@phil_eaton) August 31, 2019
August 08, 2019
Our focus on Speed
August 02, 2019
Learnings and results of our first User Testing sessions
July 24, 2019
Cuckoo Filters with arbitrarily sized tables
Edit: The original post did not mirror the hash space correctly (using C-... instead of (C-1)-...), thanks to Andreas Kipf for pointing this out.
July 20, 2019
Writing an x86 emulator from scratch in JavaScript: 2. system calls
Previously in emulator basics:
<! forgive me, for I have sinned >
1. a stack and register machine
In this post we'll extend x86e to
support the exit and write Linux system calls, or syscalls. A syscall
is a function handled by the kernel that allows the process to
interact with data outside of its memory. The SYSCALL
instruction takes arguments in the same order that the
regular CALL instruction does. But SYSCALL
additionally requires the RAX register to contain the
integer number of the syscall.
Historically, there have been a number of different ways to make
syscalls. All methods perform variations on a software interrupt.
Before AMD64, on x86 processors, there was the SYSENTER
instruction. And before that there was only INT 80h
to trigger the interrupt with the syscall handler (since interrupts
can be used for more than just syscalls). The various instructions
around interrupts have been added for efficiency as the processors and
use by operating systems evolved.
Since this is a general need and AMD64 processors are among the most common today, you'll see similar code in every modern operating system such as FreeBSD, OpenBSD, NetBSD, macOS, and Linux. (I have no background in Windows.) The calling convention may differ (e.g. which arguments are in which registers) and the syscall numbers differ. Even within Linux both the calling convention and the syscall numbers differ between x86 (32-bit) and AMD64/x86_64 (64-bit) versions.
See this StackOverflow post for some more detail.
Code for this post in full is available as a Gist.
Exit
The exit syscall is how a child process communicates with the process that spawned it (its parent) when the child is finished running. Exit takes one argument, called the exit code or status code. It is an arbitrary signed 8-bit integer. If the high bit is set (i.e. the number is negative), this is interpreted to mean the process exited abnormally such as due to a segfault. Shells additionally interpret any non-zero exit code as a "failure". Otherwise, and ignoring these two common conventions, it can be used to mean anything the programmer wants.
The wait syscall is how the parent process can block until exit is called by the child and receive its exit code.
On AMD64 Linux the syscall number is 60. For example:
MOV RDI, 0
MOV RAX, 60
SYSCALL
This calls exit with a status code of 0.
Write
The write syscall is how a process can send data to file descriptors, which are integers representing some file-like object. By default, a Linux process is given access to three file descriptors with consistent integer values: stdin is 0, stdout is 1, and stderr is 2. Write takes three arguments: the file descriptor integer to write to, a starting address to memory that is interpreted as a byte array, and the number of bytes to write to the file descriptor beginning at the start address.
On AMD64 Linux the syscall number is 1. For example:
MOV RDI, 1 ; stdout
MOV RSI, R12 ; address of string
MOV RDX, 8 ; 8 bytes to write
MOV RAX, 1 ; write
SYSCALL
This writes 8 bytes to stdout starting from the string whose address is in R12.
Implementing syscalls
Our emulator is simplistic and is currently only implementing process emulation, not full CPU emulation. So the syscalls themselves will be handled in JavaScript. First we'll write out stubs for the two syscalls we are adding. And we'll provide a map from syscall id to the syscall.
const SYSCALLS_BY_ID = {
1: function sys_write(process) {},
60: function sys_exit(process) {},
};
We need to add an instruction handler to our instruction switch. In
doing so we must convert the value in RAX from a BigInt
to a regular Number so we can look it up in the syscall map.
case 'syscall': {
const idNumber = Number(process.registers.RAX);
SYSCALLS_BY_ID[idNumber](process);
process.registers.RIP++;
break;
}
Exit
Exit is really simple. It will be implemented by calling Node's
global.process.exit(). Again we'll need to convert the
register's BigInt value to a Number.
const SYSCALLS_BY_ID = {
1: function sys_write(process) {},
60: function sys_exit(process) {
global.process.exit(Number(process.registers.RDI));
},
};
Write
Write will be implemented by iterating over the process memory as
bytes and by calling write() on the relevant file
descriptor. We'll store a map of these on the process object and
supply stdout, stderr, and stdin proxies on startup.
function main(file) {
...
const process = {
registers,
memory,
instructions,
labels,
fd: {
// stdout
1: global.process.stdout,
}
};
...
}
The base address is stored in RSI, the number of bytes to
write are stored in RDX. And the file descriptor to write
to is stored in RDI.
const SYSCALLS_BY_ID = {
1: function sys_write(process) {
const msg = BigInt(process.registers.RSI);
const bytes = Number(process.registers.RDX);
for (let i = 0; i < bytes; i++) {
const byte = readMemoryBytes(process, msg + BigInt(i), 1);
const char = String.fromCharCode(Number(byte));
process.fd[Number(process.registers.RDI)].write(char);
}
},
...
All together
$ cat exit3.asm
main:
MOV RDI, 1
MOV RSI, 2
ADD RDI, RSI
MOV RAX, 60 ; exit
SYSCALL
$ node emulator.js exit3.asm
$ echo $?
3
$ cat hello.asm
main:
PUSH 10 ; \n
PUSH 33 ; !
PUSH 111 ; o
PUSH 108 ; l
PUSH 108 ; l
PUSH 101 ; e
PUSH 72 ; H
MOV RDI, 1 ; stdout
MOV RSI, RSP ; address of string
MOV RDX, 56 ; 7 8-bit characters in the string but PUSH acts on 64-bit integers
MOV RAX, 1 ; write
SYSCALL
MOV RDI, 0
MOV RAX, 60
SYSCALL
$ node emulator.js hello.asm
Hello!
$
Next steps
We still aren't setting flags appropriately to support conditionals, so that's low-hanging fruit. There are some other fun syscalls to implement that would also give us access to an emulated VGA card so we could render graphics. Syntactic support for string constants would also be convenient and more efficient.
Latest post in the emulator basics series up: implementing some syscalls starting with sys_exit and sys_write so we can print a nice hello message. https://t.co/NEfId0lnJx #javascript #x86
— Phil Eaton (@phil_eaton) July 20, 2019