Combined Snapshot of Highlighted Pages


Page 1: Organizing Code Files in C and C++



Organizing Code Files in C and C++

Organizing Code Files in C and C++

Published Apr 06 2002 09:35 PM in C and C++ *****
By Ben Sizer

Introduction

While many simple programs fit into a single C or CPP source file, any serious project is going to need splitting up into several source files in order to be manageable. However, many beginning programmers may not realize what the point of this is - especially since some may have tried it themselves and run into so many problems that they decided it wasn't worth the effort. This article should explain why to do it, and how to do it properly. Where necessary, I give a brief explanation of how compilers and linkers work to help understand why you have to do things in a certain way.

Terminology

In this article, I will call standard C and C++ files (usually with the .C or .CPP extension) "source files". This will be to distinguish them from "header files" (usually with the .H or .HPP extension). This terminology is also used by Visual C++ and most books. Note that the difference is purely conceptual - they are both just text files with code inside them. However, as the rest of the article will show, this difference in the way you treat source and header files is very important.

Why split code into several files?

The first question some novice programmers ask when they see a directory full of separate code files is, "why isn't it all just in one file?" To them, they just don't see the point of scattering the code about.

Splitting any reasonably-sized project up buys you some advantages, the most significant of which are the following:
  • Speed up compilation - most compilers work on a file at a time. So if all your 10000 lines of code is in one file, and you change one line, then you have to recompile 10000 lines of code. On the other hand, if your 10000 lines of code are spread evenly across 10 files, then changing one line will only require 1000 lines of code to be recompiled. The 9000 lines in the other 9 files will not need recompiling. (Linking time is unaffected.)
  • Increase organization - Splitting your code along logical lines will make it easier for you (and any other programmers on the project) to find functions, variables, struct/class declarations, and so on. Even with the ability to jump directly to a given identifier that is provided in many editors and development environments (such as Microsoft Visual C++), there will always be times when you need to scan the code manually to look for something. Just as splitting the code up reduces the amount of code you need to recompile, it also reduces the amount of code you need to read in order to find something. Imagine that you need to find a fix you made to the sound code a few weeks ago. If you have one large file called GAME.C, that's potentially a lot of searching. If you have several small files called GRAPHICS.C, MAINLOOP.C, SOUND.C, and INPUT.C, you know where to look, cutting your browsing time by 3/4.
  • Facilitate code reuse - If your code is carefully split up into sections that operate largely independently of each other, this lets you use that code in another project, saving you a lot of rewriting later. There is a lot more to writing reusable code than just using a logical file organization, but without such an organization it is very difficult to know which parts of the code work together and which do not. Therefore putting subsystems and classes in a single file or carefully delineated set of files will help you later if you try to use that code in another project.
  • Share code between projects - The principle here is the same as with the reuse issue. By carefully separating code into certain files, you make it possible for multiple projects to use some of the same code files without duplicating them. The benefit of sharing a code file between projects rather than just using copy-and-paste is that any bug fixes you make to that file or files from one project will affect the other project, so both projects can be sure of using the most up-to-date version.
  • Split coding responsibilities among programmers - For really large projects, this is perhaps the main reason for separating code into multiple files. It isn't practical for more than one person to be making changes to a single file at any given time. Therefore you would need to use multiple files so that each programmer can be working on a separate part of the code without affecting the file that the other programmers are editing. Of course, there still have to be checks that 2 programmers don't try altering the same file; configuration management systems and version control systems such as CVS or MS SourceSafe help you here.
All of the above can be considered to be aspects of modularity, a key element of both structured and object-oriented design.

How to do it: The Basics

By now you're probably convinced that there are benefits to splitting up your project into several smaller files. So, how would you go about it? Although some of the decisions you make will be reasonably arbitrary, there are some basic rules that you should follow to ensure that it all works.

Firstly, look at how you would split your code into sections. Often this is by splitting it into separate subsystems, or 'modules', such as sound, music, graphics, file handling, etc. Create new files with meaningful filenames so that you know at a glance what kind of code is in them. Then move all the code that belongs to that module into that file. Sometimes you don't have clear module - some would say this should send out warnings about the quality of your design! There may still be other criteria you use for splitting up code, such as the structures it operates upon. (Often "general purpose" functions can be split into string-handling and number-handling, for example.) And occasionally a module could be split into two or more files, because it might make sense to do so on logical grounds.

Once you have split it up in this way into separate source files, the next stage is to consider what will go into the header files. On a very simple level, code that you usually put at the top of the source file is a prime candidate for moving into a separate header file. This is presumably why they got termed 'header' files, after all.

This code to go in a header usually includes some or all of the following:
  • class and struct definitions
  • typedefs
  • function prototypes
  • global variables (but see below)
  • constants
  • #defined macros
  • #pragma directives
(Additionally, when using C++, templates and inline functions usually need to be in the header file. The reasons for this should become clear later.)

For examples, look at the standard C libraries that come with any C or C++ compiler. Stdlib.h is one good example; browse to your include directory and open it up in an editor to view it. (Or to save time in MS Visual C++ 6, type in into a source file, right-click on it, and choose 'Open Document "stdlib.h"'.) You will notice that they have some or all of the above programming constructs, but no actual function code. Similarly, you may see that any global variable declarations are preceded by the 'extern' qualifier. This is important, but more on this later.

You generally want one header file for every source file. That is, a SPRITES.CPP probably needs a SPRITES.H file, a SOUND.CPP needs a SOUND.H, and so on. Keep the naming consistent so that you can instantly tell which header goes with which normal file.

These header files become the interface between your subsystems. By #including a header, you gain access to all the structure definitions, function prototypes, constants etc for that subsystem. Therefore, every source file that uses sprites in some way will probably have to #include "sprite.h", every source file that uses sound may need to #include "sound.h", and so on. Note that you use quotes rather than angular brackets when #including your own files. The quotes tell the compiler to look for your headers in the program directory first, rather than the compiler's standard headers.

Remember that, as far as the compiler is concerned, there is absolutely no difference between a header file and a source file. (Exception: some compilers will refuse to compile a header file directly, assuming you made a mistake in asking.) As stated earlier, they are both just plain text files that are filled with code. The distinction is a conceptual one that programmers must adhere to in order to keep the logical file structure intact. The key idea is that headers contain the interface, and the source files contain the actual implementation. This applies whether you are working in C or C++, in an object-oriented way or a structural way. This means that one source file uses another source file via the second source file's header.

Potential Pitfalls

The rules given above are fairly vague and merely serve as a starting point for organizing your code. In simple cases, you can produce completely working programs by following those guidelines. However there are some more details that have to be accounted for, and it is often these details that cause novice programmers so much grief when they first start splitting their code up into header files and normal files.

In my experience, there are four basic errors that people encounter when they first enter the murky world of user-defined header files.
  • The source files no longer compile as they can't find the functions or variables that they need. (This often manifests itself in the form of something similar to "error C2065: 'MyStruct' : undeclared identifier" in Visual C++, although this can produce any number of different error messages depending on exactly what you are trying to reference.)
  • Cyclic dependencies, where headers appear to need to #include each other to work as intended. A Sprite may contain a pointer to the Creature it represents, and a Creature may contain a pointer to the Sprite it uses. No matter how you do this, either Creature or Sprite must be declared first in the code, and that implies that it won't work since the other type isn't declared yet.
  • Duplicate definitions where a class or identifier is included twice in a source file. This is a compile time error and usually arises when multiple header files include one other header file, leading to that header being included twice when you compile a source file that uses them. (In MSVC, this might look something like "error C2011: 'MyStruct' : 'struct' type redefinition.)
  • Duplicate instances of objects within the code that compiled fine. This is a linking error, often difficult to understand. (In MSVC, you might see something like "error LNK2005: "int myGlobal" (?myGlobal@@3HA) already defined in myotherfile.obj".)
So how do we fix these issues?

Fixing Problem 1

Luckily, these issues are easy to fix, and even easier to avoid once you understand them.

The first error, where a source file refuses to compile because one of the identifiers was undeclared, is easy to resolve. Simply #include the file that contains the definition of the identifier you need. If your header files are organized logically and named well, this should be easy. If you need to use the Sprite struct, then you probably need to #include "sprite.h" in every file that does so. One mistake that programmers often make is to assume that a file is #included simply because another header #includes it for itself.

Example:
/* Header1.h */
#include "header2.h"
class ClassOne { ... };

/* Header2.h */
class ClassTwo { ... };

/* File1.cpp */
#include "Header1.h"
ClassOne myClassOne_instance;
ClassTwo myClassTwo_instance;

In this case, File1.cpp will compile fine, as including Header1.h has indirectly #included Header2.h, meaning that File1.cpp has access to the Class2 class. But what happens if, at a later date, someone realises that Header1.h doesn't actually need to #include Header2.h? They can remove that #include line, and suddenly File1.cpp will break the next time you try to compile it.

The key here, is to explicitly #include any header files that you need for a given source file to compile. You should never rely on header files indirectly including extra headers for you, as that may change. The 'extra' #includes also serve as documentation, by demonstrating what other code this file is dependent on. So don't try and leave them out if you know you need that header included somehow.

Fixing Problem 2

Cyclic (or two-way) dependencies are a common problem in software engineering. Many constructs involve a two-way link of some sort, and this implies that both classes or structures know about each other. Often this ends up looking like this:

/* Parent.h */
#include "child.h"
class Parent
{
Child* theChild;
};

/* Child.h */
#include "parent.h"
class Child
{
Parent* theParent;
};

Given that one of these has to be compiled first, you need some way to break the cycle. In this case, it's actually quite trivial. The Parent struct doesn't actually need to know the details of the Child class, as it only stores a pointer to one. Pointers are pretty much the same no matter what they point to, therefore you don't need to the definition of the structure or class in order to store a pointer to an instance of that structure or class. So the #include line is not needed. However, simply taking it out will give you an "undeclared identifier" error when it encounters the word 'Child', so you need to let the compiler know that Child is a class or class that you wish to point to. This is done with a forward declaration, taking the form of a class or class definition without a body. Example:

/* Parent.h */
class Child; /* Forward declaration of Child; */
class Parent
{
Child* theChild;
};

Notice how the #include line is replaced by the forward declaration. This has allowed you to break the dependency between Parent.h and Child.h. Additionally, it will speed up compilation as you are reading in one less header file. In this case, the same procedure can (and should) be followed in Child.h by forward declaring "class Parent;" As long as you are only referring to a pointer and not the actual type itself, you don't need to #include the full definition. In 99% of cases, this can be applied to one or both sides of a cycle to remove the need to #include one header from another, eliminating the cyclic dependency.

Of course, in the source files, it's quite likely that there will be functions that apply to Parent that will manipulate the Child also, or vice versa. Therefore, it is probably necessary to #include both parent.h and child.h in parent.c and child.c.

Attached Image: gdarticle_cyclic1.png
Another situation where I have found cyclic dependencies to arise in C++ is where functions are defined in the header file to make them inline (and hence potentially faster). In order for the function to operate, it often needs to know the details of a class it operates on. This tends to mean that the header file where the inline function is declared needs to #include whichever headers are necessary for the function to compile. The first piece of advice here is that you should only make functions inline when you are sure that they are too slow otherwise. If you have tested your code and certain functions absolutely need to be inlined (and thus in the header file), try and ensure that the dependency between 2 header files is only one way by isolating the inline functions in one of the two headers.

Note that being able to eliminate a dependency entirely is not always possible. Many classes and structs are composed of other classes and structs, which is a dependency you cannot avoid. However, as long as this dependency is one-way, the order of compilation will be fixed and there should be no problem.

There are more in-depth ways of resolving cyclic dependencies, but they are beyond the scope of this article. In 99% of cases, using forward declarations and favoring normal functions in C./.CPP files over inline functions in header files will be enough.

Fixing Problem 3

Duplicate definitions at compile time imply that a header ended up being included more than once for a given source file. This leads to a class or struct being defined twice, causing an error. The first thing you should do is ensure that, for each source file, you only include the headers that you need. This will aid compilation speed since the compiler is not reading and compiling headers that serve no purpose for this file.

Sadly, this is rarely enough, since some headers will include other headers. Let's revisit an example from earlier, slightly modified:

/* Header1.h */
#include "header3.h"
class ClassOne { ... };

/* Header2.h */
#include "header3.h"
class ClassTwo { ... };

/* File1.cpp */
#include "Header1.h"
#include "Header2.h"
ClassOne myClassOne_instance;
ClassTwo myClassTwo_instance;

Header1.h and Header2.h #include header3.h for some reason. Maybe ClassOne and ClassTwo are composed out of some class defined in Header3.h.. The reason itself is not important; the point is that sometimes headers include other headers without you explicitly asking for it, which means that sometimes a header will be #included twice for a given source file, despite your best intentions. Note that it doesn't matter that header3.h is being #included from different header files: during compilation, these all resolve to one file. The #include directive literally says "include the specified file right here in this file while we process it", so all the headers get dumped inline into your source file before it gets compiled. The end result looks something like this:

Attached Image: gdarticle_includefiles.png
For the purposes of compilation, File1.cpp ends up containing copies of Header1.h and Header2.h, both of which include their own copies of Header3.h. The resulting file, with all headers expanded inline into your original file, is known as a translation unit. Due to this inline expansion, anything declared in Header3.h is going to appear twice in this translation unit, causing an error.

So, what do you do? You can't do without Header1.h or Header2.h, since you need to access the structures declared within them So you need some way of ensuring that, no matter what, Header3.h is not going to appear twice in your File1.cpp translation unit when it gets compiled. This is where inclusion guards come in.

If you looked at stdlib.h earlier, you may have noticed lines near the top similar to the following:

#ifndef _INC_STDLIB
#define _INC_STDLIB

And at the bottom of the file, something like:

#endif  /* _INC_STDLIB */

This is what is known as an inclusion guard. In plain English, it says "if we haven't defined the symbol "_INC_STDLIB" yet, then define it and continue. Otherwise, skip to the #endif." This is a similar concept to writing the following code to get something to run once:

static bool done = false;
if (!done)
{
/* Do something */
done = true;
}

This ensures that the 'do something' only ever happens once. The same principle goes for the inclusion guards. During compilation of File1.cpp, the first time any file asks to #include stdlib.h, it reaches the #ifndef line and continues because "_INC_STDLIB" is not yet defined. The very next line defines that symbol and carries on reading in stdlib.h. If there is another "#include " during the compilation of File1.cpp, it will read to the #ifndef check and then skip to the #endif at the end of the file. This is because everything between the #ifndef and the #endif is only executed if "_INC_STDLIB" is not defined; and it got defined the first time we included it. This way, it is ensured that the definitions within stdlib.h are only ever included once by putting them within this #ifndef / #endif pair.

This is trivial to apply to your own projects. At the start of every header file you write, put the following:

#ifndef INC_FILENAME_H
#define INC_FILENAME_H

Note that the symbol (in this case, "INC_FILENAME_H") needs to be unique across your project. This is why it is a good idea to incorporate the filename into the symbol. Don't add an underscore at the start like stdlib.h does, as identifiers prefixed with an underscore are supposed to be reserved for "the implementation" (ie. the compiler, the standard libraries, and so on). Then add the #endif /* INC_FILENAME_H */ at the end of the file. The comment is not necessary, but will help you remember what that #endif is there for.

If you use Microsoft Visual C++, you may have also noticed 3 lines in stdlib.h similar to:

#if     _MSC_VER > 1000
#pragma once
#endif

Other compilers may have code similar to this. This effectively achieves the same thing as the inclusion guards, without having to worry about filenames, or adding the #endif at the bottom of the file, by saying that "for this translation unit, only include this file once". However, remember that this is not portable. Most compilers don't support the #pragma once directive, so it's a good habit to use the standard inclusion guard mechanism anyway, even if you do use #pragma once.

With inclusion guards in all your headers, there should be no way for any given header's contents to end up included more than once in any translation unit, meaning you should get no more compile-time redefinition errors.

So that's compile-time redefinition fixed. But what about the link-time duplicate definition problems?

Fixing Problem 4

When the linker comes to create an executable (or library) from your code, it takes all the object (.obj or .o) files, one per translation unit, and puts them together. The linker's main job is to resolve identifiers (basically, variables or functions names) to machine addresses in the file. This is what links the various object files together. The problem arises when the linker finds two or more instances of that identifier in the object files, as then it cannot determine which is the 'correct' one to use. The identifier should be unique to avoid any such ambiguity. So how come the compiler doesn't see an identifier as being duplicated, yet the linker does?

Imagine the following code:

/* Header.h */
#ifndef INC_HEADER_H
#define INC_HEADER_H
int my_global;
#endif /* INC_HEADER_H */

/* code1.cpp */
#include "header1.h"
void DoSomething()
{
++my_global;
}

/* code2.cpp */
#include "header1.h"
void DoSomethingElse()
{
--my_global;
}

This first gets compiled into two object files, probably called code1.obj and code2.obj. Remember that a translation unit contains full copies of all the headers included by the file you are compiling. Finally, the object files are combined to produce the final file.

Here's a visual depiction of the way these files (and their contents) are combined:

Attached Image: gdarticle_linker.png
Notice how there are two copies of "my_global" in that final block. Although "my_global" was unique for each translation unit (this would be assured by the use of the inclusion guards), combining the object files generated from each translation unit would result in there being more than one instance of my_global in the file. This is flagged as an error, as the linker has no way of knowing whether these two identifiers are actually same one, or if one of them was just misnamed and they were actually supposed to be 2 separate variables. So you have to fix it.

The answer is not to define variables or functions in headers. Instead, you define them in the source files where you can be sure that they will only get compiled once (assuming you don't ever #include any source files, which is a bad idea for exactly this reason). This gives you a new problem: how do you make the functions and variables globally visible if they aren't in a common header any more? How will other files "see" them? The answer is to declare the functions and variables in the header, but not to define them. This lets the compiler know that the function or variable exists, but delegates the act of resolving the address to the linker.

To do this for a variable, you add the keyword 'extern' before its name:

extern int my_global;

The 'extern' specifier is like telling the compiler to wait until link time to resolve the 'connection'. And for a function, you just put the function prototype:

int SomeFunction(int parameter);
Functions are considered 'extern' by default so it is customary to omit the 'extern' in a function prototype.

Of course, these are just declarations that my_global and SomeFunction exist somewhere. It doesn't actually create them. You still have to do this in one of the source files, as otherwise you will see a new linker error when it finds it cannot resolve one of the identifiers to an actual address. So for this example, you would add "int my_global" to either Code1.cpp or Code2.cpp, and everything should work fine. If it was a function, you'd add the function including its body (ie. the code of the function) into one of the source files.

The rule here is to remember that header files define an interface, not an implementation. They specify which functions, variables, and objects exist, but it is not responsible for creating them. They may say what a struct or class must contain, but it shouldn't actually create any instances of that struct or class. They can specify what parameters a function takes and what it returns, but not how it gets the result. And so on. This is why the list of what can go into a header file earlier in this article is important.

For convenience, some people like to put all the 'extern' declarations into a Globals.h file, and all the actual definitions into a Globals.cpp file. This is similar to the approach MS Visual C++ takes in the automatically generated projects by providing stdafx.h and stdafx.cpp. Of course, most experienced programmers would tell you that global variables are generally bad, and therefore making an effort to reduce or eliminate them will improve your program anyway. Besides, most so-called globals don't need to be truly global. Your sound module probably doesn't need to see your Screen object or your Keyboard object. So try to keep variables in their own modules rather than placing them all together just because they happen to be 'global'.

There are two notable exceptions to the "no function bodies in header files", because although they look like function bodies, they aren't exactly the same.

The first exception is that of template functions. Most compilers and linkers can't handle templates being defined in different files to that which they are used in, so templates almost always need to be defined in a header so that the definition can be included in every file that needs to use it. Because of the way templates are instantiated in the code, this doesn't lead to the same errors that you would get by defining a normal function in a header. This is because templates aren't compiled at the place of definition, but are compiled as they are used by code elsewhere.

The second exception is inline functions, briefly mentioned earlier. An inline function is compiled directly into the code, rather than called in the normal way. This means that any translation unit where the code 'calls' an inline function needs to be able to see the inner workings (ie. the implementation) of that function in order to insert that function's code directly. This means that a simple function prototype is insufficient for calling the inline function, meaning that wherever you would normally just use a function prototype, you need the whole function body for an inline function. As with templates, this doesn't cause linker errors as the inline function is not actually compiled at the place of definition, but is inserted at the place of calling.

Other Considerations

So, your code is nicely split across various files, giving you all the benefits mentioned at the start like increased compilation speed and better organization. Is there anything else you need to know?

Firstly, if you use the C++ standard library, or any other library that uses namespaces, you may find yourself using their identifiers in your header files. If so, don't use the "using" keyword in your headers, as it reduces the effectiveness of namespaces almost to the point of uselessness. Instead, put the using keyword in the source files if necessary, and explicitly qualify the names with their namespace prefix in the header files. Otherwise, when you start using functions that have the same names as the standard library, whether your own or from another library, you'll either have to start renaming them, or go back and edit your header anyway, which in turn will affect all the files that depend on it. Better to do it properly in the first place.

Secondly, the use of macros should be carefully controlled. C programmers have to rely on macros for a lot of functionality, but C++ programmers should avoid them wherever possible. If you want a constant in C++, use the 'const' keyword. If you want an inline function in C++, use the 'inline' keyword. If you want a function that operates on different types in C++, use templates or overloading. But if you need to use a macro for some reason, and you place it in a header file, try not to write macros that could potentially interfere with code in the files that include it. When you get weird compilation errors, you don't want to have to search through every header file to see if someone used a #define to inadvertently change your function or its parameters to something else. So wherever possible, keep macros out of headers unless you can be sure that you don't mind them affecting everything in the project.

Conclusion

If done properly, organizing your files this way can speed your development without hindering you at all or introducing any extra problems. Although it seems like there is a lot to consider when doing so, it becomes second nature in no time. Just remember these key points:
  • Split your code up along logical module divisions
  • The the interface into the header files and the implementation in the source files
  • Use forward declarations wherever possible to reduce dependencies
  • Add an inclusion guard to every header file you make
  • Don't put code in the headers that might adversely affect source files that include it
Good luck!



0 Comments

Page 1 of 1

No comments have been made yet

Page 1 of 1



Feedback

User Information

Feedback

 



Page 2: How do i use a header file in a c++ program? - Yahoo! Answers



How do i use a header file in a c++ program? - Yahoo! Answers
Make Y! Your Homepage
  1. Home >
  2. All Categories >
  3. Computers & Internet >
  4. Programming & Design >
  5. Resolved Question
logansiebenmorgen logansie...
Member since:
July 15, 2007
Total points:
286 (Level 2)

Resolved Question

Show me another »

How do i use a header file in a c++ program?

I have constantly tried to use classes in my c++ programs in header files but they never work, i always either get a error that says the included file doesn't exist or the program runs then just closes unexpectedly when it tries to use the header file.

Additional Details

I have constantly tried to use classes in my c++ programs in header files but they never work, i always either get a error that says the included file doesn't exist or the program runs then just closes unexpectedly when it tries to use the header file. i use double quotations, i changed the file name from .h to .hpp, and i have included the file into the project itself and supposedly that means i just need to use the name not a location, i get no errors, the program just closes...

4 years ago

jonathanlks by jonathan...
Member since:
June 26, 2007
Total points:
4,520 (Level 4)

Best Answer - Chosen by Voters

First if your putting classes into header files, make sure those header files are saved as .hpp files not .h files.

Second make sure you either are including your headers in only one place or they have preprocessor commands to stop multiple copies from being created (using #define, #ifndef and #endif)

P.S.
Even if the header file is included in the project, you can't just use the name with #include unless the header is in the same directory.

Also try putting your classes into your cpp files themselves and see if it's the classes and not putting them in headers that is causing your program to close. Also another major thing is if your program is a command line program, unless you have some kind of pause at the end (like a getch() or a system(PAUSE)), windows will by default close the window after the program finishes running.
100% 1 Vote

There are currently no comments for this question.

Other Answers (2)

  • Bean by Bean
    Member since:
    May 31, 2007
    Total points:
    1,258 (Level 3)
    If it is your own header file you have created, you need to use double quotes in the include statement. Heres an example:

    #include "MyClass.h"

    If you are storing your class in a subfolder of the program, such as 'Classes', you would use #include "Classes\MyClass.h"
    0% 0 Votes
  • Dragons Dinner by Dragons Dinner
    Member since:
    July 15, 2007
    Total points:
    407 (Level 2)
    If its compiling, then the problem isnt because of your header files

    The proper way to include it would be
    #include "myheader.h"

    Note you use .h and not .hpp, though it doesnt really matter
    You need to debug your program to see why its just quitting. It wil be something in the .c or .cpp that isnt working properly
    0% 0 Votes



Page 3: C Style



C Style

Defining Programming Standards   
for Professional Programmers 
  

         

Home

Contents

1: Standards

2: Psychological Factors

3: General Principles

4: Commenting

5: Naming

6: Code Layout

7: File Layout

8: Language Usage

9: Data Usage

10: Programming Usage

11: Implementing Standards

A: Example Standard

B: References

C: Glossary

Syque

About

Share this page:

Google
C Style
syque.com
Web

 

 

Books and
more at:

USA:

In association with amazon.com

UK:

In Association with Amazon.co.uk

Canada:

In Association with amazon.ca

 

 

CHAPTER 7 : File Layout

PART 3 : LAYOUT

CHAPTER 7 : File Layout
7.1 Layout of directories
7.2 Division of files
7.3 Considerations for File Layout
7.4 Header files
7.5 Layout of Data files
7.6 Layout of Code files
7.7 Summary

<--Prev page | Next page -->

 

7.4  Header files

Header files simplify code and data files by removing a possibly large number of declarations, etc. from the start of the file. In doing this, however, they make the removed items effectively invisible to the reader of the file. The programmer should counteract this by organizing the contents and layout of the header files so that the reader can easily find the missing items.

7.4.1  Division of header files

A header file should 'serve' a coherent group of code files. A small program might have just one header file. A larger program would have more header files, divided to reflect the division of these code files.

There are several possible uses for header files: a system interface file provides the external world with all it needs to access this system; a system common file contains common things used within this system; a system specific file contains items used only within a specific portion of this system.

System interface files may contain function prototypes, data interface to the system, including data structure declarations, error codes, etc. They act as the 'ambassador' for the subsystem, and this may be the only files that you write that other users ever read. Thus special care should be taken to ensure that these files are particularly complete, well structured and clearly commented.

There can be several levels of interface header files, depending on their effective scope for example to an external platform (eg. operating system, windowing system) or an internal subsystem (eg. keyboard interface).

7.4.2  Memory reservation

Header files can contain any C item, but commonly do not contain any data definitions (ie. those that reserve memory), as this could result in multiple instances of the same symbol being declared in different files, which would cause linker resolution problems. It also may be unclear who 'owns' the item. Similarly, executable code in header files is to be avoided (other than in macros). By insisting that header files are completely passive, causing no memory to be reserved for data or for code, readers and users can be confident that no definitions are 'invisible' to them.

7.4.3  Header file interdependency

It can make for confusing reading if header files interact or have some dependency on one another. Consider the case where one header file must be #include'd before another:

 

/* sysio.h */
#define MAX_BUF_SIZE   128

/* kbdio.h */
struct KBD_BUF
{
    char    KbdBuffer[MAX_BUF_SIZE];
    int     KbdStatus;
};

/* kbdctrl.c */
#include "kbdio.h"    /* WRONG: this file should be included after sysio.h */
#include "sysio.h"

-----------------------------------------------------------------------------

This makes it difficult for the reader of 'kbdio.h' to find where MAX_BUF_SIZE is declared. It is also an additional awkward piece of information to remember for anyone extending the system. Confusion caused by this can be avoided by nesting header files:

 

/* kbdio.h */
#include "sysio.h"
...
struct KBD_BUF
{
    char    KbdBuffer[MAX_BUF_SIZE];
    int     KbdStatus;
};

 

This, however, can cause problems of duplicate symbols (which are not allowed) if 'sysio.h' is included twice. This can be avoided by bracketing the contents of header files with an '#ifndef':

 

/* sysio.h */
#ifndef sysio_h
#define sysio_h
...
all statements in file
...
#endif

 

It is not always possible to eliminate interaction between header files without causing worse problems. Nevertheless, it will help to improve readability if interaction is minimized.

7.4.4  Header files and declarations

Where a data item is used by multiple functions, the best place for its description is in a single header file. This is then #include'd in the code file that defines and the file that uses that data. It also allows constants associated with the data to be defined within the same context.

 

/* kbddata.h */
typedef struct                        /* describe item in header file */
{
    char        KB_Buffer[KB_BUF_LEN];
    unsigned    KB_Status;
} KB_BUFFER;
/*
 *  values for KB_Status
 */
#define KB_FULL     0
#define KB_EMPTY    1

/* kbdread.c */
#include "kbdextn.h"
...
static KB_BUFFER KbdBuffer;           /* define it in 'owner' file      */
...
    ReadKbd( &KbdBuffer );

/* kbdio.c */
#include "kbddata.h"
...
int ReadKbd ( KB_BUFFER *KbdBuffer )  /* declare it in usage files     */

-------------------------------------------------------------------------------

7.4.5  Layout of header files

As with all files, the header file should start with a header comment, explaining its scope and contents, followed by a number of sections preceded by subheader comments. The layout of rest of the file will depend very much on its contents.

The layout could simply be by statement type, putting declarations earlier if they more likely be used elsewhere, thus: #include's, then #define's, enum's, struct's, typedef's and function declarations.

A better approach is to divide it by functional area so that all items pertaining to a single usage or group of usages are placed together (thus minimizing visual scope). A good test of this is that if the file becomes too big, the points at which it may be split are naturally available. Care must be taken when doing this that dependent items are in the correct order. This can be helped by sorting within these groupings by type, as above. Thus, a typical format might be:

 

   File heading comment

#ifndef filename_h

#define filename_h

   Include files

   Common context items:

       #define's, enum's

   Functional area 1:

       Heading comment

       #define's, enum's for context

       Data description 1:

                   Heading comment

                   enum's for context

                   typedef or struct to declare data item

                   #define's for constant values

       Data description 2:

          ...

       External function declarations

   Functional area 2:

...

#endif /* filename_h */

 

Note that having several functional areas in one header file is a part of the complexity balancing act between scope and file size. It may quite reasonable to have a header file containing only one functional area or even one data description, such as a message identification header file, which contains only #define's for external message numbers.

 

<--Prev page | Next page -->

 

 

  © Syque 1995-2010

Massive Content -- Maximum Speed   

 



Page 4: Program Organization



Program Organization
  C++ Programming Guide HomeContentsPreviousNextIndex


Chapter 2

Program Organization

The file organization of a C++ program requires more care than is typical for a C program. This chapter describes how to set up your header files, inline function definitions, and template definitions.

2.1 Header Files

Creating an effective header file can be difficult. Often your header file must adapt to different versions of both C and C++. To accommodate templates, make sure your header file is tolerant of multiple inclusions (idempotent), and is self-contained.

2.1.1 Language-Adaptable Header Files

You might need to develop header files for inclusion in both C and C++ programs. However, Kernighan and Ritchie C (K&R C), also known as "classic C," ANSI C, Annotated Reference Manual C++ (ARM C++), and ISO C++ sometimes require different declarations or definitions for the same program element within a single header file. (See the C++ Migration Guide for additional information on the variations between languages and versions.) To make header files acceptable to all these standards, you might need to use conditional compilation based on the existence or value of the preprocessor macros __STDC__ and __cplusplus.

The macro __STDC__ is not defined in K&R C, but is defined in both ANSI C and C++. Use this macro to separate K&R C code from ANSI C or C++ code. This macro is most useful for separating prototyped from nonprototyped function definitions.

#ifdef _ _STDC_ _
int function(char*,...);      // C++ & ANSI C declaration
#else
int function();               // K&R C
#endif 


The macro __cplusplus is not defined in C, but is defined in C++.


Note – Early versions of C++ defined the macro c_plusplus instead of _ _cplusplus. The macro c_plusplus is no longer defined.

Use the definition of the __cplusplus macro to separate C and C++. This macro is most useful in guarding the specification of an extern "C" interface for function declarations, as shown in the following example. To prevent inconsistent specification of extern "C", never place an #include directive within the scope of an extern "C" linkage specification.

#include "header.h"
...                     // ... other include files ... 
#if defined(_ _cplusplus)
extern "C" {
#endif
  int g1();
  int g2();
  int g3()
#if defined(_ _cplusplus)
}
#endif


In ARM C++, the __cplusplus macro has a value of 1. In ISO C++, the macro has the value 199711L (the year and month of the standard expressed as a long constant). Use the value of this macro to separate ARM C++ from ISO C++. The macro value is most useful for guarding changes in template syntax.

// template function specialization
#if _ _cplusplus < 199711L
int power(int,int);                       // ARM C++
#else
template <> int power(int,int);           // ISO C++
#endif


2.1.2 Idempotent Header Files

Your header files should be idempotent. That is, the effect of including a header file many times should be exactly the same as including the header file only once. This property is especially important for templates. You can best accomplish idempotency by setting preprocessor conditions that prevent the body of your header file from appearing more than once.

#ifndef HEADER_H
#define HEADER_H
/* contents of header file */
#endif


2.1.3 Self-Contained Header Files

Your header files should include all the definitions that they need to be fully compilable. Make your header file self-contained by including within it all header files that contain needed definitions.

#include "another.h"
/* definitions that depend on another.h */


In general, your header files should be both idempotent and self-contained.

#ifndef HEADER_H
#define HEADER_H
#include "another.h"
/* definitions that depend on another.h */
#endif


2.1.4 Unnecessary Header File Inclusion

Programs written in C++ typically include many more declarations than do C programs, resulting in longer compilation times. You can reduce the number of declarations through judicious use of several techniques.

One technique is to conditionally include the header file itself, using the macro defined to make it idempotent. This approach introduces an additional interfile dependence.

#ifndef HEADER_H
#include "header.h"
#endif



Note – System header files often include guards of the form _Xxxx, where X is an uppercase letter. These identifiers are reserved and should not be used as a model for constructing macro guard identifiers.

Another way to reduce compilation time is to use incomplete class and structure declarations rather than including a header file that contains the definitions. This technique is applicable only if the complete definition is not needed, and if the identifier is actually a class or structure, and not a typedef or template. (The standard library has many typedefs that are actually templates and not classes.) For example, rather than writing:

#include "class.h"
a_class* a_ptr;


write:

class a_class;
a_class* a_ptr;


(If a_class is really a typedef, the technique does not work.)

One other technique is to use interface classes and factories, as described in the book Design Patterns: Elements of Reusable Object-Oriented Software, by Erich Gamma (Addison-Wesley, 1994).

2.2 Inline Function Definitions

You can organize your inline function definitions in two ways: with definitions inline and with definitions included. Each approach has advantages and disadvantages.

2.2.1 Function Definitions Inline

You can use the definitions-inline organization only with member functions. Place the body of the function directly following the function declaration within the class definition.

class Class 
{
    int method() { return 3; }
};


This organization avoids repeating the prototype of the function, reduces the bulk of source files and the chance for inconsistencies. However, this organization can introduce implementation details into what would otherwise be read as an interface. You would have to do significant editing if the function became non-inline.

Use this organization only when the body of the function is trivial (that is, empty braces) or the function will always be inline.

2.2.2 Function Definitions Included

You can use the definitions-included organization for all inline functions. Place the body of the function together with a repeat (if necessary) of the prototype. The function definition may appear directly within the source file or be included with the source file

class Class {
    int method();
};
inline int Class::method() {
    return 3;
}


.

This organization separates interface and implementation. You can move definitions easily from header files to source files when the function is no longer implemented inline. The disadvantage is that this organization repeats the prototype of the class, which increases the bulk of source files and the chance for inconsistencies.

2.3 Template Definitions

You can organize your template definitions in two ways: with definitions included and with definitions separated. The definitions-included organization allows greater control over template compilation.

2.3.1 Template Definitions Included

When you put the declarations and definitions for a template within the file that uses the template, the organization is definitions-included. For example:

main.cc
template <class Number> Number twice( Number original );
template <class Number> Number twice( Number original )
    { return original + original; }
int main( )
    { return twice<int>( -3 ); }


When a file using a template includes a file that contains both the template's declaration and the template's definition, the file that uses the template also has the definitions-included organization. For example:

twice.h
#ifndef TWICE_H
#define TWICE_H
template <class Number> Number twice( Number original );
template <class Number> Number twice( Number original )
    { return original + original; }
#endif
main.cc
#include "twice.h"
int main( )
    { return twice( -3 ); }



Note – It is very important to make your template headers idempotent. (See Section 2.1.2 "Idempotent Header Files".)

2.3.2 Template Definitions Separate

Another way to organize template definitions is to keep the definitions in template definition files, as shown in the following example.

twice.h
template <class Number> Number twice( Number original );
twice.cc
template <class Number> Number twice( Number original )
    { return original + original; }
main.cc
#include "twice.h"
int main( )
    { return twice<int>( -3 ); }


Template definition files must not include any non-idempotent header files and often need not include any header files at all. (See Section 2.1.2 "Idempotent Header Files".)


Note – Although it is common to use source-file extensions for template definition files (.c, .C, .cc, .cpp, .cxx), template definition files are header files. The compiler includes them automatically if necessary. Template definition files should not be compiled independently.

If you place template declarations in one file and template definitions in another file, you have to be very careful how you construct the definition file, what you name it, and where you put it. You might also need to identify explicitly to the compiler the location of the definitions. Refer to C++ User's Guide for information about the template definition search rules.


Sun Microsystems, Inc.
Copyright information. All rights reserved.
Feedback
Library   |   Contents   |   Previous   |   Next   |   Index






Sites In Our Network

Join Our Network
Close