Deep C++  
Optional Equipment Package

Bobby Schmidt
Microsoft Corporation

June 11, 2002

In my previous column, I survey new language-conformance features in Microsoft® Visual C++® .NET. This time I look at some new Microsoft-specific language extensions. (As I've cautioned in similar situations, the list of features I show here is representative, not canonical.)

The C and C++ standards let implementers (such as Microsoft) extend the standard language and library, so long as the presence of the extensions doesn't harm standard-conformant code. This attitude is consistent with the language committees' philosophy of allowing innovation without forcing programmers to pay for what they don't use. While Visual C++ .NET is not flawless in this regard, it is leagues beyond previous versions.

Distressingly, the compiler options enabling these extensions are inconsistently applied. In theory, /Za disables language extensions, while /Ze (which is on by default) enables them. In practice, some of the extensions work with /Za, some don't. Adding to the fun: throwing /clr enables some extensions, regardless of /Za or /Ze. If there's a method in the madness, I sure can't divine it.

Naming Extensions
Many of the extensions manifest as Microsoft-specific keywords and identifiers. To prevent such extensions from stomping on top of other names, the standards partition the set of all possible names into three domains:

Some specific names are explicitly reserved by the standards. In the C standard, the reserved names are macros, keywords, and global-scope declarations. In the C++ standard, most reserved names that would be global in C are elements of namespace std.
Of the remaining possible names, those matching certain patterns are reserved for language and library implementers. The patterns differ slightly between C and C++; I summarize them below.
All other names are reserved for you, the programming proletariat. You still have to duke it out with vendors of third-party libraries and your own team members to avoid name clashes, but that's a problem beyond the standard's concern. And if you use a name that's reserved for the standard or implementers, you're on your own; compilers aren't required to diagnose such usage, and I don't know of any that do.
Names Microsoft May Usurp
The C standard specifically reserves these names for implementers of the standard C language and library:

All global names starting with _ (single underscore).
All names starting with _A, _B, …, _Z—that is, an underscore followed by an upper-case letter.
All names starting with __ (double underscore).
To make things simple, the style rule I use for C is:

Avoid all names starting with _.
This is overly restrictive but safe and easy to apply.

The C++ standard reserves a slightly different name set:

All global names starting with _, just as in C.
All names starting with _A, _B, …, _Z—again, just as in C.
All names containing __ anywhere, not just at the start. I suspect the C++ committee broadened this rule to better allow name decoration/mangling.
The style rule I use for C++ is a superset of the C++ rule:

Avoid all names starting with _.
Avoid all names containing __.
Unless you have some compelling reason to keep separate C and C++ style rules, I suggest you use the C++ rules for both languages. Note that you can safely use trailing single underscores in either language. This is the naming style I use for inclusion guards, data members, and other "private" names.
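
As a rough illustration of the two-line C++ style rule above, here is a minimal sketch of a checker you could run over identifiers in a code review script. The function name is_name_to_avoid is hypothetical, and the check deliberately mirrors the conservative style rule rather than the standards' exact wording.

```cpp
#include <string>

// Returns true if `name` falls into the space the style rule says to avoid:
// a leading underscore, or a double underscore anywhere in the name.
// (Hypothetical helper; stricter than the standards actually require.)
bool is_name_to_avoid(std::string const &name)
    {
    if (!name.empty() && name[0] == '_')
        return true;                                // starts with _
    return name.find("__") != std::string::npos;    // contains __
    }
```

Under this rule _sin and my__name are flagged, while a trailing single underscore such as count_ passes.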

Stealth Name Usage
Implementers can use their names in non-obvious contexts. In particular, the names can show up as intrinsic statements or library calls that don't appear in the source (even preprocessed), but do appear in the object file. This is the real danger of using these names: You may scrub your standard-library headers and interfaces, and convince yourself that a certain name is available, only to find that your implementer has injected the name into your code during translation.

For illustration, compile the small example

#include <math.h>

int main(void)
   {
   return sin(0.);
   }

as either C or C++. In the mixed source/assembly listing generated by cl /FAs, you'll see this code implementing the return statement:

fldz
push   ecx
push   ecx
fstp   QWORD PTR [esp]
call   _sin
pop   ecx
pop   ecx
call   __ftol2

The generated code contains two hidden calls to implementation-specific functions, both of which have names following the standards' name-reservation rules.

The Managed Extensions for C++
I will dissect managed extensions in future columns. Here I just want to give the barest introduction.

According to this page in the Visual C++ .NET documentation, 18 keywords "implement various features of the Managed Extensions for C++." The strong suggestion: These keywords are the Managed Extensions for C++, or at least form the core of those extensions. The further implication: The keywords require compilation with /clr, and are therefore new to Visual C++ .NET. (Otherwise they wouldn't be "managed" extensions.)

I find the documentation ambiguous. Of the 18 keywords:

Private, protected, and public aren't specific to Microsoft, although in managed code they can be used in non-standard ways.
__finally is available in Visual C++ 6.0.
None of these four requires compilation with /clr.

The remaining 14 keywords are new to Visual C++ .NET. Of those, three don't require /clr:

__event
__identifier
__interface
Of the remaining 11 that do require /clr, five can appear in native functions:

__abstract
__nogc
__sealed
__try_cast
__value
This leaves six that are specific to managed functions:

__box
__delegate
__gc
__pin
__property
__typeof
Reading the product documentation more closely, I note that the reference pages for the 11 keywords requiring /clr, plus __identifier, are nested under the main reference page describing Managed Extensions in general. Maybe those 12 are the "real" Managed Extension keywords.

Then again, this page suggests the 14 new keywords are the real Managed Extension keywords.

Regardless, I mentally organize the new keywords into these usage buckets:

__abstract, __delegate, __event, __interface, and __property are kinds of declared entities.
__gc, __nogc, and __value define whether and how a type is managed.
__try_cast and __typeof interrogate the type system.
__box and __pin bridge the chasm between the managed and unmanaged worlds.
__sealed inhibits derivation.
__identifier harmonizes names among source languages.
In future columns devoted to managed development, I'll likely explore the extensions in these groupings.

Other Keywords
The following are other keyword extensions new to Visual C++ .NET:

__alignof is kindred to sizeof, an operator accepting a type operand and yielding a size_t byte count (in this case, the operand's alignment).
__assume passes hints to the code optimizer. According to the documentation, the optimizer assumes that the keyword's operand is true. An operand of 0 would appear to conflict with this assumption; yet bizarrely, the docs advocate __assume(0)! Apparently __assume(0) renders a particular swath of code unreachable. The documentation gives the specific example of __assume(0) after the default label in a switch statement. This tells the optimizer to assume the default case can't be reached.
__if_exists and __if_not_exists test—at compile time—for the presence or absence of a potentially declared name. They are analogous to #ifdef and #ifndef for symbols not stripped by preprocessing.
__debugbreak places a breakpoint. It appears equivalent to the Tester's Friend, a.k.a. __asm int 3.
__hook, __raise, and __unhook interact with __event declarations. __hook and __unhook associate and dissociate event handlers and events, while __raise raises events.
__noop can be used in place of a function name in a function-call expression. Any function arguments following __noop will not be evaluated. When I test __noop in other contexts, it appears to evaluate as int-type constant 0. Since the documentation says nothing about such usage, I'm not sure if this is supported behavior.
__super aliases one of a class's bases, depending on which base class best matches __super's usage and context. For singly inherited classes, __super is analogous to the Java keyword super and the C# keyword base.
__w64 treats an object as if it were 64-bit for diagnostic purposes only. I explore the __w64 keyword in a previous column.
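
To make the __assume(0) idiom concrete, here is a small sketch of the default-label usage the documentation describes. The macro ASSUME_UNREACHABLE, the enum Color, and the function channel_index are all hypothetical names of mine. The non-Microsoft branch uses __builtin_unreachable, which I'm treating as the nearest GCC/Clang analogue; that mapping is my assumption, not something the Visual C++ docs state.

```cpp
#if defined(_MSC_VER)
#define ASSUME_UNREACHABLE() __assume(0)               // Microsoft-specific hint
#elif defined(__GNUC__)
#define ASSUME_UNREACHABLE() __builtin_unreachable()   // assumed GCC/Clang analogue
#else
#define ASSUME_UNREACHABLE() ((void)0)                 // no hint available
#endif

enum Color { red, green, blue };

int channel_index(Color c)
    {
    switch (c)
        {
        case red:   return 0;
        case green: return 1;
        case blue:  return 2;
        default:
            ASSUME_UNREACHABLE();   // promise to the optimizer: never reached
        }
    return -1;                      // only reachable when no hint is available
    }
```

The hint is a promise, not a check: if some other value does reach the switch, the optimizer is free to have generated code that misbehaves.
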
In addition, the hoary storage-class extension __declspec supports these new attributes:

__declspec(align(...)) controls a user-defined type's alignment.
__declspec(deprecated) tags a function as obsolete or potentially unsupported. Calls to such functions generate compile-time warning C4996.
__declspec(noinline) prevents a function from being expanded inline. __declspec(noinline) is the moral opposite of the standard keyword inline.
All of the keywords work in both managed and native code. None requires /clr. And some even work in C:

__alignof
__assume
__debugbreak
__declspec
__noop
__w64
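
Here is a brief sketch pairing __declspec(align(...)) with __alignof. The macros ALIGN16 and ALIGNMENT_OF and the type Vec4 are hypothetical names. So that the sample also builds on compilers lacking the Microsoft keywords, the fallback branch uses alignas and alignof, which came into standard C++ well after this compiler shipped.

```cpp
#include <cstddef>

#if defined(_MSC_VER)
#define ALIGN16         __declspec(align(16))   // Microsoft-specific attribute
#define ALIGNMENT_OF(T) __alignof(T)            // Microsoft-specific operator
#else
#define ALIGN16         alignas(16)             // standard fallback (C++11)
#define ALIGNMENT_OF(T) alignof(T)
#endif

// A user-defined type forced onto a 16-byte boundary.
struct ALIGN16 Vec4
    {
    float x, y, z, w;
    };

// Like sizeof, the operator takes a type and yields a size_t byte count.
std::size_t vec4_alignment()
    {
    return ALIGNMENT_OF(Vec4);
    }
```
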
Language Lawyer Alert
I don't think that __noop technically qualifies as a keyword, since it can be used as a non-macro identifier:

int __noop; // OK

and real keywords can't:

int friend;    // error
int __alignof; // error

I'm not actually sure what to call __noop, given how its behavior changes depending on context:

int x = __noop; // OK, __noop evaluates as 0
__noop(++x);    // OK, __noop is not evaluated; neither is ++x
int __noop;     // OK, __noop is a normal identifier
__noop(++x);    // now this is an error, since __noop is an int
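
For the function-call usage, the pattern I'd expect to see most often is a tracing macro that compiles away. The following is only a sketch: TRACE_DISABLED, expensive, and trace_side_effects are hypothetical names, and the non-Microsoft branch substitutes a variadic macro that discards its arguments.

```cpp
#if defined(_MSC_VER)
#define TRACE_DISABLED __noop                 // arguments parsed, never evaluated
#else
#define TRACE_DISABLED(...) ((void)0)         // fallback: arguments discarded
#endif

static int g_calls = 0;

// Stands in for an argument with a visible side effect.
int expensive()
    {
    return ++g_calls;
    }

// Returns how many times expensive() actually ran.
int trace_side_effects()
    {
    TRACE_DISABLED(expensive());   // expensive() is not called
    return g_calls;
    }
```

Because the argument is never evaluated, g_calls stays at 0 after the "call."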

Alleged Keyword Extensions
The types __m64, __m128, __m128d, and __m128i hold 64-bit and 128-bit values. Several headers declare a large collection of new processor-specific intrinsic functions that manipulate these types. These intrinsic functions have the semantics of inline functions, but require no definition. Many of the "calls" map to a single assembly-language instruction.

The functions are declared in four headers:

mmintrin.h wraps Intel® MMX (multimedia extension) instructions.
xmmintrin.h and fvec.h wrap Intel Streaming SIMD Extensions (or SSE) instructions.
mm3dnow.h wraps Advanced Micro Devices (AMD) 3DNow!™ instructions.
Example:

#include <mmintrin.h> // declares intrinsics for Intel MMX

class MM_wrapper
    {
public:
    operator __m64() const // type wraps 64-bit MMX register value
        {
        return _mm_cvtsi32_si64(i_); // call to intrinsic function
        }
private:
    int i_;
    // ...
    };

The Visual C++ .NET docs claim that the types __m64, et al, are keywords, which implies the compiler knows about them natively. Yet my testing shows that the types aren't keywords at all:

int __m64; // OK, but should be an error if __m64 is a keyword

Instead, they are actually declared in the headers along with the intrinsic functions:

#include <mmintrin.h>

int __m64; // now an error

I consider the documentation's claim a bug. Further, because both the types and the intrinsic functions require inclusion of a header file, I don't consider either set a true language extension. At best, they are library extensions that the compiler treats as special cases when generating code.

The types and intrinsic functions work in both C and C++.

Other Data Types
Visual C++ 6.0 predefines the non-standard integer types __int8, __int16, __int32, and __int64. These types are miscible with, but separate from, the standard fundamental integer types:

void f(__int32)
    {
    }

void f(int) // OK, overload of f(__int32)
    {
    }

This works even though both __int32 and int are represented identically in generated code as 32-bit signed integers.

In Visual C++ .NET, the first three of these types are actually aliases for char, short, and int:

void f(__int32)
    {
    }

void f(int) // error, redefinition of f(__int32)
    {
    }

The only exception is __int64, which has no alias among the standard primitive types.

Pragmas
Pragmas are hybrid oddities. The preprocessor directive that introduces them (#pragma) is standard, and must be supported by every compiler. What comes after the directive is implementation-defined, and might not be supported by any compiler. (C++ has no standard pragmas; C99 defines only a handful, all spelled #pragma STDC.)

Visual C++ .NET supports a new set of pragmas that Visual C++ 6.0 does not. Indeed, there's no guarantee that any other compiler supports them. Nonetheless, major compiler vendors would be wise to at least silently absorb Microsoft's pragmas; otherwise source translated by Visual C++—the galactic arm's most popular compiler—will likely generate warnings on those other systems.

The new pragmas:

conform throttles the behavior of compiler option /Zc:forScope, which controls the scope of variables declared within for statements. I discuss this option here.
deprecated lists functions that are deprecated (in the C and C++ sense, not in the real English sense). It is functionally equivalent to declaring those same functions with __declspec(deprecated).
managed and unmanaged control whether a set of functions is managed or unmanaged. They are meaningful only with the /clr compiler option.
runtime_checks governs the behavior of compiler option /RTC, which I describe here.
section creates a section (with optional attributes) in an object file.
In addition, the old pragma pack has a new option show that displays the current packing alignment (as a warning) during compile time.
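
A short sketch of pragma pack, including the new show option, follows. The struct and function names are hypothetical. I've guarded the show line so that only the Microsoft compiler sees it, since I'm not assuming other compilers accept it. Expect a warning at that line; that is how the current packing value gets reported.

```cpp
#include <cstddef>

// Default packing: the compiler may pad between tag and value.
struct Natural
    {
    char tag;
    int  value;
    };

#pragma pack(push, 1)        // save current packing, then pack on 1-byte boundaries
struct Packed
    {
    char tag;
    int  value;
    };
#if defined(_MSC_VER)
#pragma pack(show)           // reports the current packing value as a warning
#endif
#pragma pack(pop)            // restore the saved packing

std::size_t natural_size() { return sizeof(Natural); }
std::size_t packed_size()  { return sizeof(Packed); }
```

With 1-byte packing the padding disappears, so packed_size() is exactly sizeof(char) + sizeof(int).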

Other Preprocessor Directives
Visual C++ .NET supports two non-standard preprocessor directives: #using and #import. Both are available in only C++, not C. Both are also analogous to #include, in that they introduce "source" from an external context during C++ translation.

#using is new to Visual C++ .NET. (And no, it has nothing to do with the standard keyword using.) It imports metadata into a managed program, and thus must be used with the /clr compiler option. In an odd quirk, the #using directive appears in the preprocessed output generated by options /E, /EP, or /P. You can verify this by compiling the tiny example

#using <mscorlib.dll>

with cl /clr /EP and examining the preprocessor output.

#import is available in Visual C++ 6.0. It translates a native type library into a pair of C++ header files: the primary or .tlh header, and the secondary or .tli header. In effect, #import is the native (and primitive) analogue of #using. Like #using, #import works only in C++.

#import supports a large number of attributes that affect the generated headers. Visual C++ .NET extends #import with three new attributes:

embedded_idl preserves attribute-generated code in the primary header file.
no_dual_interfaces causes the wrapper for dual-interface methods to call those methods through IDispatch::Invoke instead of through a v-table.
no_smartpointers omits smart pointer (_com_ptr_t) declarations in the generated interfaces.
In counterpoint to #using, #import directives are executed and replaced in preprocessed output. To witness a stupendous example, compile

#import <mscorlib.tlb>

with cl /EP /Zs. On my system, the resulting preprocessed output stream is over 200,000 lines and 2,000,000 characters long.

Predefined Macros
Visual C++ .NET defines several new non-standard macros.

Ho-Hum
_MANAGED is defined as 1 for code compiled with /clr, and undefined otherwise.
Mildly Arousing
__COUNTER__ is a compile-time counter. It evaluates to the int constant 0 the first time it's expanded, 1 the second time, and so on:

#include <iostream>
using namespace std;

int main()
    {
    cout << __COUNTER__ << endl;
    cout << __COUNTER__ << endl;
    cout << __COUNTER__ << endl;
    }

/* run-time result:
0
1
2
*/

Note that the counter increments happen at compile-time, not at run time:

#include <iostream>
using namespace std;

int main()
    {
    for (int i = 0; i != 3; ++i)
        cout << __COUNTER__ << endl;
    }

/* run-time result:
0
0
0
*/
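
One plausible use for a compile-time counter is minting unique identifiers inside macros. The sketch below is an assumption about usage, not something the documentation recommends; UNIQUE_NAME, REGISTER, Registrar, and registry are all hypothetical names. The two-level CONCAT is needed so that __COUNTER__ expands to its number before the paste happens.

```cpp
#include <string>
#include <vector>

#define CONCAT_IMPL(a, b) a##b
#define CONCAT(a, b) CONCAT_IMPL(a, b)                    // expands arguments first
#define UNIQUE_NAME(prefix) CONCAT(prefix, __COUNTER__)   // e.g. registrar_0

// Function-local static, so the list exists before any Registrar runs.
inline std::vector<std::string> &registry()
    {
    static std::vector<std::string> names;
    return names;
    }

// Each object records one name when it is constructed.
struct Registrar
    {
    explicit Registrar(char const *name)
        {
        registry().push_back(name);
        }
    };

// Every use declares a distinctly named static object at the same scope.
#define REGISTER(name) static Registrar UNIQUE_NAME(registrar_)(name)

REGISTER("alpha");
REGISTER("beta");
REGISTER("gamma");
```

The three REGISTER lines would collide if they all declared the same identifier. The actual numeric suffixes depend on how often __COUNTER__ was expanded earlier in the translation unit, so the code never relies on them.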

Titillating
Three new macros relate to an enclosing function's name or signature:

__FUNCTION__ expands to the function's name.
__FUNCSIG__ expands to the function's friendly (human-readable) signature.
__FUNCDNAME__ expands to the function's decorated (software-readable) signature.
Here's a somewhat elaborate example of each:

#include <iostream>
using namespace std;

typedef int INT;

template<typename T>
INT const volatile __cdecl abc()
    {
    cout << __FUNCTION__ << endl;
    cout << __FUNCSIG__ << endl;
    cout << __FUNCDNAME__ << endl;
    return 0;
    }

int main()
    {
    abc<int const volatile>();
    }

When run, this program produces

abc<int const volatile >
volatile const int __cdecl abc<int const volatile >(void)
?abc@?$@$$CDH@@YA?DHXZ

Note that the original source's appearance is not perfectly preserved in the friendly signature:

() is replaced with (void).
INT is replaced with the type it aliases (int).
The return type's CV-qualifier sequence const volatile is reordered as volatile const.
Those CV-qualifiers, which follow the type name in the source, appear before int in the return type.
The same sequence (const volatile int) appears as both the return type and the template argument in the original source. Oddly, the macro expansion scrambles the return-type order, but preserves the template-argument order.

Because the template argument can't be deduced, the template specialization must be explicit. If you rewrite the template so that the argument is deducible:

#include <iostream>
using namespace std;

typedef int INT;

template<typename T>
INT const volatile __cdecl abc(T = T())
    {
    cout << __FUNCTION__ << endl;
    cout << __FUNCSIG__ << endl;
    cout << __FUNCDNAME__ << endl;
    return 0;
    }

int main()
    {
    abc<int const volatile>();
    }

the results change:

abc
volatile const int __cdecl abc(volatile const int)
?abc@@YA?DHH@Z

Unsurprisingly, the function parameter is no longer void. But quite surprisingly, the function name is now just the template name abc, rather than the actual specialization abc<const volatile int>. The macros act as if abc is an ordinary (non-template) function overload.

The macro troubles evidence a more fundamental flaw: the compiler and linker conspiring to incorrectly decorate certain kinds of function signatures. While I don't show it here, this problem affects Visual C++'s standard conformance. And since we now care about such conformance, presumably we'll fix the flaw in a future Visual C++ release.

Sundries:

The macros must appear in the scope of a function body.
While the documentation is silent on this point, my testing shows each expanded value to have type char [N] rather than the char const [N] you might expect.
The macros are not expanded in preprocessed output generated by compiler options /E, /EP, and /P.
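
The obvious application for these macros is diagnostic logging. Here is a minimal sketch that captures a call site; HERE, FUNCTION_SIGNATURE, CallSite, make_call_site, and probe are hypothetical names. On compilers without __FUNCSIG__, I fall back to __PRETTY_FUNCTION__, which I'm assuming is a close-enough GCC/Clang counterpart.

```cpp
#include <string>

#if defined(_MSC_VER)
#define FUNCTION_SIGNATURE __FUNCSIG__            // Microsoft-specific
#else
#define FUNCTION_SIGNATURE __PRETTY_FUNCTION__    // assumed GCC/Clang counterpart
#endif

struct CallSite
    {
    std::string name;        // bare function name
    std::string signature;   // human-readable signature
    };

inline CallSite make_call_site(char const *name, char const *signature)
    {
    CallSite site;
    site.name = name;
    site.signature = signature;
    return site;
    }

// Must be expanded inside a function body, per the first sundry above.
#define HERE() make_call_site(__FUNCTION__, FUNCTION_SIGNATURE)

CallSite probe(int)
    {
    return HERE();
    }
```

Calling probe(0) yields a name of "probe" and a signature string that contains it. The exact signature text differs from compiler to compiler.
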
Exceptions
Visual C++ .NET supports a funky extension to normal exception-specification syntax:

void f() throw(...)
    {
    // ...
    }

According to the documentation:

throw(...) tells the compiler that a function could throw an exception. This is useful if you want your functions to explicitly specify whether or not they will throw exceptions.
From what I can tell, there's no semantic difference between the Microsoft-specific

void f() throw(...)

and the standard-conformant

void f()

I think the benefit (such as it is) lies in the syntax: By explicitly tagging a function with throw(...), you can communicate intention more clearly. Used diligently and deliberately, void f() throw(...) can say to users "I've analyzed this function's exception characteristics, and have found that the function might throw, although I can't yet guarantee what it could throw." void f() could say that, yes; but it could also say "I haven't analyzed this function's exception characteristics," or "this function was originally written in C," or "this function calls into code written in another language."

If you adopt Microsoft's throw extension, you can tag every function with an exception specification:

void f1() throw();       // f1 throws nothing
void f2() throw(...);    // f2 may throw an unspecified exception
void f3() throw(T1, T2); // f3 may throw exception of type T1 or T2

Counterpoint
Some prominent colleagues believe that exception specifications are either a good idea implemented badly, or just a bad idea. Perhaps the most significant critic is Herb Sutter, who's researched and written about C++ exceptions more than anyone else I know.

Much of the trouble comes from the lack of static type checking on exception specifications:

void f() throw() // f claims to throw nothing, but...
    {
    throw 1;     // the compiler allows this anyway
    }

The specifications are checked only at run time; and even then the consequence is typically program termination, albeit with one last chance at program cleanup. Specifications can also incur a non-trivial speed or space cost at run time, especially if used indiscriminately. (To be fair: That's true as well for other C++ aspects, such as copy construction and template instantiation, yet we don't use that argument to justify wholesale avoidance of these features.)

I expect that specifications will stay dynamically checked as a general rule, if only to allow interoperation with code lacking specifications. At best, I can envision two kinds of specifications in the future: those checked at run time much as they are today, and a separate set checked at compile time, somewhat like Java's. The latter would require a change to the C++ standard; you can find one such proposed change here.

Conversely, I won't be surprised if the C++ committee takes a different tack: voting exception specifications out of the standard, or at least tagging them as deprecated. Not every one of the committee's inventions has been a success; maybe we should admit exception specifications are a failed experiment and move on.

Attributes
As attributes warrant their own column, I show them only briefly here.

Here's a partial native example adapted from the Visual C++ docs:

#include <windows.h>

[module(name="MyLibrary")]
[uuid("2F5F63F1-16DA-11d2-9E7B-00C04FB926DA"), dual]
__interface IStatic : IDispatch
    {
    // ...
    HRESULT P1([in] long);    
    };

[cpp_quote("#include file.h")];

int main()
    {
    }

All of the bits in [] are attributes. Because standard C++ doesn't use such syntax, code outside the attributes is ignorant of and not directly affected by the pieces inside. This mechanism cleanly and obviously separates the extended features from the standard ones.

The nature and meaning of attributes depend on their context:

In native code, attributes enhance interaction with COM and type libraries, and generalize a similar concept from IDL and ODL. They are specific to Visual C++.
In managed code, custom attributes extend metadata, which is the CLR analogue of type libraries. They are available in all CLR-targeting languages, not just Visual C++.
Coda
Next time I'll summarize significant standard-conformant language features still missing in Visual C++ .NET.
