Written by Julie Zelenski
Before you dive in, it might help to work through some thought exercises to bring our your inner bitwise ninja. Answering these questions is not required, nor to be submitted. You are encouraged to freely discuss these warmup questions and answers with your peers (in person, on forum) to help everyone reach a solid foundation on the basic concepts before starting.
You also may find it helpful to review the bits/int/float exercises from labs 3 and 4, especially if you left tasks unfinished or feel you still have gaps in your understanding.
1 << 31
or 0x80000000
and both are quite readable. The mask also happens to be unsigned decimal value 2147483648
, but who can tell? Don't make yourself (or the reader) have to translate between binary and decimal to follow along.pow(2, 31)
is equivalent to 1 << 31, but costs orders of magnitude more compute cycles. Similarly using integer multiply, divide, and mod by powers of two to move/isolate/change bits is wasteful when the bitwise operations can do these tasks directly and efficiently.unsigned
that you intend to manipulate bitwise. By declaring as unsigned, you will avoid any special treatment of the most significant bit as a sign bit. Whereas the bitwise operations & | ^ ~ and << don't give any special status to the MSB, you have to take care with the right shift operator >> . A logical right shift always shifts zeros in to fill the vacated bits on the left, an arithmetic right shift copies the previously leftmost bit to fill the vacated bits (thus replicating the sign bit). Right shifting a unsigned value always performs a logical shift. Right shifting a signed value performs an arithmetic shift using our gcc compiler. (However, the C standard leaves this choice up to the implementation, so you could run into a different behavior on another compiler).Students who aren't yet comfortable with bits often end up expressing operations in roundabout/convoluted ways. For example, compare the two approaches below for testing the third bit of a value:
shifted = val >> 3; // shift target bit over to lsb
if ((shifted & 1) != 0) ... // then test if lsb is on
// Better version:
if ((val & (1 << 3)) != 0) ... // test target bit in-place with mask
The first version right-shifts the bits over to get the target bit in the 0th position and uses a mask to test the LSB, where as the second tests the target bit in place using a targeted mask. The second is preferred--- it is more idiomatic and the mask can be optimized into a compile-time constant and moved to a #define constant for readability.
Here's another example: you need to extract the 4 high-order bits of a byte and shift them down to the lower half-byte. You could do it with two steps (mask and shift), but in fact, the shift will already discard the low-order bits, so no need to mask them off first:
masked = val & 0xf0; // hex f0 is 1111 0000 in binary
result = masked >> 4;
// Better version:
result = val >> 4; // instead just directly shift, low-order bits gone!
And lastly, look at this bit of code below that right-shifts then immediately left-shifts back. At first glance, this seems like it would just cycle the bits back to where they were originally -- what's the point? Trace carefully, the right-shift will discard the LSB and then zero-fill it when left-shifting. This is roundabout way to wipe the LSB, but the better approach is to just directly mask it off as the second version below does:
result = (val >> 1) << 1;
// Better version:
result = val & ~1; // turn off LSB if on
At first you may find yourself using the more long-winded expressions as you think it through, but reviewing in a later pass, you can spot these issues and clean them up. Eventually you want to become facile enough with bit manipulations that you go straight for the streamlined, direct expression from the beginning.
Remember gdb can print/examine in binary using the t
format (e.g. using /1wt
). That will come in handy when working with bits.
cmp_bits
that don't rely on counting, but instead take advantage of the bitwise properties. For example, it is possible to quickly eliminate the bits in common since they don't change the comparison result. You can also streamline by processing both integers in a single loop and stopping as soon as you can determine the result. If you go this route, we encourage you to play around with how you can streamline, as this exploration will strengthen your understanding of bitwise manipulation, but we will accept any simple, reasonably efficient implementation (such as the dual bit-count and subtract).builtin_popcount.
You might find it a useful testing tool to compare against your own counter, but will receive no credit for using it to implement bit counting.&
|
and ^
can be thought of as operations on two sets. What is the effect of bitwise &
as a set operation? What about |
or ^
?is_single
may come down to being able to efficiently determine if only a single bit is on. A simple tactic is to count "on" bits and compare the count to 1, but this is one of the inefficient/looping approaches that we mention can work but isn't the most direct. Instead think about which bit patterns have a single "on" bit. What do they represent numerically? Is there anything interesting about those numbers and their relationship to other nearby numbers that could help you identify whether you have one of those numbers or not?cmp_bits
. As an alternative, you can consider what is the relationship between a value and its bit-count. What values have bit-count 2? What values have bit-count 7 or 8?INT_MIN
/INT_MAX
/UINT_MAX
and so on for char/short/etc. These constants can't be directly used in your saturating operations as they are bitwidth-specific, but you can examine the values to discern their patterns).int
by default. You can add a suffix to the constant to dictate its type. For example, a suffix of L indicates the type should be long, LL is for long long. You can also use U (either by itself or before L or LL) to indicate you want the constant to be unsigned.&
applied to a char and an int). The bitwise shift and invert operations promote the operand to an int, so bit-shifting a char will promote to an int before shifting, bit-inverting of a char similarly promotes, and the default result type of these expressions will be int. If the bitwidth of the operand was larger than int, the bitwidth will be unchanged by these promotion rules.sizeof
operator!stype/utype
is controlled by the SAT_CHOICE
#define at the top of bits.h
. The starter Makefile includes additional targets (char, short, int, long and longlong) that configure the #define and rebuild bitstest. This is a more convenient way to try different bitwidths that repeatedly editing the bits.h file (although the latter does work as well).echo_sequence
function will echo back the values assigned into your sequence. Look carefully to spot values that were rounded on assignment and adjust them as necessary.float.h
header defines a constant FLT_EPSILON
. The name suggests it could be the resolution between any pair of neighboring float values (impossible!) or perhaps some minimum magnitude float, but it's much less exciting than that. FLT_EPSILON
is defined as the distance from the float value 1.0 to its next higher neighbor.disassemble
. It will be helpful to ask gdb to disassemble the same bytes as a comparison for testing. The x/i
command in gdb will examine a memory address and prints its contents disassembled into instructions.%esp
. Which is it? The explanation is that %esp
is disallowed as the base in an push indirect-with-displacement because its register code is overloaded to use as the marker for the push scaled-index subtype. The ISA designers had to sacrifice one register to do this and chose the register they deemed least likely to be used. This means %esp
as base for indirect-with-displacement has to be encoded using the longer scaled-index form. There is a somewhat similar overlay on the push indirect with %esp
or %ebp
as base. This choice by the designers avoids consuming another bit of encoding for most cases, while making one special case longer. (oh, the joys of CISC architectures...) For the purposes of the assignment, just ignore the possibility of the base being %esp
or %ebp
for indirect or indirect-with-displacement. This avoids the ambiguity and you won't need to make a special case out of anything.A note on gdb and float/doubles
You can invoke function calls at the gdb prompt and it will make a valiant effort to invoke the function and print its result. This is pretty handy for working more interactively/experimentally. But be warned that the mechanism can be a little squirrely. For example, if the called function takes or returns variables of float/double type, you may get garbled results. The issue has to do with gdb not having access to the complete type information and doing a bit of guessing. You can add a typecast on the function being called (to its proper type, of course) to force the right interpretation. For example:
(gdb) p fabs(-3.5)
$1 = 1 // huh ??? why is result from fabs so wrong?
(gdb) p ((double (*)(double))fabs)(-3.5) // tell gdb fabs takes double, returns double
$2 = 3.5
Consider this the one exception to the rule that you should never ever cast a function pointer. :-)
A rabbit
bit
A little bit
An itty-bitty
Little bit of beet.
Then bit
By bit
He bit
Because he liked the taste of it.
But when he bit
A wee bit more,
It was more bitter than before.
"This beet is bitter!"
Rabbit cried.
"I feel a bit unwell inside!"
But when he bit
Another bite, that bit of beet
Seemed quite all right.
Besides
When all is said and done,
Better bitter beet
Than none.
-- Mary Ann Hoberman