Inside the VB3 .EXE

by _Duke_ 
 
Introduction

The following essay is not intended to be a "How to crack VB programs" essay, but I will show you exactly HOW a VB program is protected from de-compilers. It is important that you have a working knowledge of programming using Visual Basic in order to understand the essay and, more importantly, to follow the source code of the programs you de-compile. Although this essay covers an older version of VB, there are many programs out there which have yet to be cracked. It also serves as a starting point for understanding the later versions.

Tools required

1) Visual Basic 3- For Compiling our own test programs.

2) A Good Hex Editor- Use one that will let you Binary Compare two files and display the differences.

3) SoftIce- Of Course!

4) MAKE_MAK.EXE or DoDi's VBOPT to "Protect" our programs.

5) A VB DeCompiler. (Is there anything but DoDi's??!!)
 
Essay

As many people have discovered, a VB program isn't really a 'Program' in the traditional sense of the word. Visual Basic is an 'Interpreted' language. What this means is that the program is stored in a 'higher level' language than the machine's native code. It is the job of the interpreter to read back and execute this higher language AT RUNTIME. Most other languages (such as C) are stored in native code and need nothing to translate them. In case you didn't know, the VB interpreter is VBRUN300.DLL (No wonder all the VB programs need it to run!!) This is the REAL program that is running. Any Softice breakpoints you set for the 'standard' Windows routines will ALWAYS return you to VBRUN, not to the EXE! The interpreter is reading the contents of the EXE, translating the TOKENS, and executing various subroutines to perform the desired task.

A VB program, therefore, cannot be disassembled by the standard tools. Softice is pretty much useless here unless you like to follow the spaghetti inside VBRUN. The program can, however, be de-compiled back into VB source code thanks to DoDi's VBDIS. It is available on the net as shareware but I STRONGLY recommend you get (and pay for, it's worth it) the full version if you are serious about R-E'ing VB programs. This decompilation is possible thanks to Micro$oft's including information in the executable that is not needed for the program to run. Now why would they do that?

 The VB executable is made up of the same basic parts as other Windows programs:

     DOS HEADER: This is provided for backward compatibility of the EXE file format.

     STUB PROGRAM: Checks if Windows is running. Provides an error message if the
              program is being run from DOS.

     WINDOWS HEADER: This section provides important information about the EXE to
              the operating system. Some of the more important locations are:

              OFFSET (hex)        FUNCTION
          ---------------------------------------------------------
              14        Initial value of CS:IP
              1C        Number of Segments
              22        Relative offset to Segment Table (typ. 40)
              24        Relative offset to Resource Table
              3E        The expected Windows version
***Note About Hex Editors: There seems to be a difference in opinion as to the 'START' of a program. Some editors call the start byte 00, while others consider it byte 01. If the addresses you are looking at just don't seem right, try shifting 1 byte to the right or left.

For a good reference on the Windows Header, look in the WIN SDK help file WIN31WH.HLP and look under "Executable-File Header Format".
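To make the offset table concrete, here is a minimal Python sketch that pulls those fields out of an EXE image held in memory. It is only an illustration: the function and field names are my own, and it relies on two facts about the file format that the essay does not spell out, namely that the dword at 3C in the DOS header holds the file offset of the Windows header and that the Windows header begins with the signature 'NE'.

```python
import struct

# Offsets (hex) relative to the start of the Windows header, from the table above.
# CS:IP at 14 is stored as two words: IP first, then CS.
NE_FIELDS = {
    "initial_ip": 0x14,
    "initial_cs": 0x16,
    "segment_count": 0x1C,
    "segment_table_offset": 0x22,
    "resource_table_offset": 0x24,
    "expected_windows_version": 0x3E,
}


def read_windows_header(data: bytes) -> dict:
    """Return the header start plus the word-sized fields listed in NE_FIELDS."""
    # Assumption: the dword at 0x3C points at the Windows header.
    (header_start,) = struct.unpack_from("<I", data, 0x3C)
    if data[header_start:header_start + 2] != b"NE":
        raise ValueError("no 'NE' signature at the Windows header offset")
    fields = {"header_start": header_start}
    for name, offset in NE_FIELDS.items():
        (fields[name],) = struct.unpack_from("<H", data, header_start + offset)
    return fields
```

The table offsets that come back are relative to the header start, so the segment table of a file whose header sits at 0600 and whose field at 22 reads 40 starts at 0600 + 0040 = 0640.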

A short VB program (1 form/module) will typically contain 4 entries in its segment table referencing 3 segments (one can be ignored). One of the segments, usually located just after the Windows Header, is a single CALL instruction which transfers control to the interpreter. THIS IS THE ONLY CODE IN THE VB PROGRAM THAT RUNS!!!!!! The other segments point to the Tokens themselves and a section which specifies how the tokens are structured into the various Subs and Functions.

Resources are 'packages' of data in a pre-defined format which a program will access. Examples of resources are Icons, Fonts, and Menus. In a VB program, they are also used to reference Forms and other 'Data' sections of the program.
 
 
 

Some Hands On

** For this section of the lesson, you will need CALC.EXE compiled from the samples that come with VB3; it should compile to 9020 bytes. Or download it here within +Fravia's page.

Start your favorite Hex Editor and load CALC.EXE. Examine the following sections as I describe them. I have found it easiest to print the whole file in hex starting from the windows header and use colored markers to see what the sections 'look' like.

    0000-003F DOS HEADER- Note the 06 @ 003D; This is the start page of the Windows Header.
    0200-049F Stub Program- This code only runs from DOS.
    0600-07FF Windows Header- Lets look more closely:
          0614- Initial CS:IP = 10 00 01 00
                This translates to 10 bytes past the start of segment 1
          061C- # Segments = 04 00
          0622- Offset to Start of Segment Table = 40 00
                Segment table starts @ 0640, segments are 4 words long
               Segment 1 @ 0640 - 08 00 19 00 50 1D 19 00
                    This means:
                         The segment is located @ 0800
                         The segment is 0019 bytes long
                         1D50 - Flags (more later)
                         The segment needs 0019 bytes of memory
               Segment 2 @ 0648 - 00 00 00 00 11 0C 02 00
                    Ignore this segment definition
               Segment 3 @ 0650 - 0F 00 50 02 10 1D 50 02
                    This segment @ 0F00 is the 'Sub Structure Table'
               Segment 4 @ 0658 - 09 00 D0 50 10 1C D0 50
                    This segment @ 0900 is the Tokens (the 'Code')
          0624- Offset to Resource Table = 68 00
               Table starts @ 0668:
               Word @ 0668 = 08 00 - This is rscAlignShift, ignore it for now
               First a resource's Type is defined, then all of the resources of that type follow:
               First Type definition @ 066A - 0E 80 01 00 00 00 00 00
                    This means:
                         The TypeID is 800E (A Group Icon)
                         There is 1 resource defined
                    *The last 2 words are reserved
               Then the resc. is defined @ 0672 - 12 00 01 00 30 1C 01 80 00 00 00 00
                    This means:
                         The resource starts on Page 0012
                         It is 0001 Pages long
                         1C30 is more Flags
                         The resource's ID is 8001
                    *Again, the last two words are reserved
               The next type definition is @ 067E - 03 80 01 00 00 00 00 00
                    'There is 0001 resource of type 8003 (Icon)'
               Then the resource definition @ 0686 - 13 00 03 00 30 1C 01 80 00 00 00 00
                    The resource starts at page 0013 and is 3 pages long
               The next type definition is @ 0692 - 0A 80 05 00 00 00 00 00
                    'There are 0005 resources of type 800A (Data)'
                    * There are actually 4 resources, the 3rd is skipped
               Then the 4 definitions starting @ 069A:
                    069A - 16 00 02 00 30 1C 01 80 00 00 00 00
                    06A6 - 18 00 02 00 30 1C 02 80 00 00 00 00
                    06B2 - 1A 00 09 00 30 1C 04 80 00 00 00 00
                    06BE - 23 00 01 00 30 1C 05 80 00 00 00 00
               These resources are respectively:
                    Forms Definitions
                    Internal Definitions
                    A Form
                    Form and Control Names
               It should be noted that the resource ID is not related to what
               the resource is used for. The function of the resource is
               identified by its header bytes.
               The FLAGS sections of the segments and resources are used for
               information like if they are MOVABLE, SHAREABLE, PRELOADED,
               EXECUTEONLY, etc.
    06D8-07FF Various name tables used by windows
    0800-0819 This is the first segment. If you remember, the initial value
              of CS:IP was 10 bytes past the start of this segment. This byte is
              a long CALL (9A) into the interpreter. The address is computed at
              runtime since there is no way to tell where VBRUN will load into
              memory. The bytes which follow the segment are loading information
              for other segments.
    0900-0EFF These are the actual tokens. The source code is translated to this
              at compile time. Strings are stored literally; this helps us to find
              our place while comparing tokens to source. More on this section and
              the ones that follow in the next lesson.
    0F00-114F This section defines how the tokens are arranged into their various
              subs.
    1200-12FF This is the GROUP_ICON definition (Don't bother!)
    1300-15FF This is the ICON definition. For information on this and the previous
              section look in the WIN SDK help file under 'Graphics File Formats'
    1600-17FF This is the Forms Definitions section. Here, information on forms,
              imported VBX's, and controls is stored.
    1800-19FF This section's format is quite mysterious but it is used to hold
              object definitions like forms, controls, variables, and constants.
    1A00-22FF This is the actual form used in the program. Its format is very similar
              to a VB .FRM file. Notice the 'in line' icon @ 1A61. Pictures are also
              stored this way. The form's controls are defined in the second half of
              the form.
    2300-END  These are the control names. ***This section is unnecessary for program
              operation and is removed when the program is PROTECTED.***

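The segment and resource table entries walked through above can be decoded mechanically. The following Python sketch shows one way to do it; treat it as a sketch under assumptions rather than a finished tool. The function names and dictionary keys are mine. The segment shift of 8 is an assumption chosen because it turns the stored 0008 into the 0800 quoted above, and the rule that a TypeID of zero ends the resource table comes from the general format description rather than from this essay.

```python
import struct


def parse_segment_table(data, table_start, count, align_shift=8):
    """Decode `count` segment entries of 4 words each."""
    segments = []
    for index in range(count):
        sector, length, flags, min_alloc = struct.unpack_from(
            "<4H", data, table_start + 8 * index
        )
        segments.append({
            "file_offset": sector << align_shift,  # 0008 becomes 0800
            "length": length,
            "flags": flags,
            "min_alloc": min_alloc,
        })
    return segments


def parse_resource_table(data, table_start):
    """Walk the type blocks of a resource table until a zero TypeID."""
    (align_shift,) = struct.unpack_from("<H", data, table_start)  # rscAlignShift
    pos = table_start + 2
    resources = []
    while True:
        (type_id,) = struct.unpack_from("<H", data, pos)
        if type_id == 0:  # assumed terminator, see the note above
            break
        (count,) = struct.unpack_from("<H", data, pos + 2)
        pos += 8  # TypeID, count, and two reserved words
        for _ in range(count):
            page, pages, flags, res_id = struct.unpack_from("<4H", data, pos)
            resources.append({
                "type_id": type_id,
                "resource_id": res_id,
                "file_offset": page << align_shift,  # page 0012 becomes 1200
                "length": pages << align_shift,      # 1 page is 0100 bytes
                "flags": flags,
            })
            pos += 12  # four words plus two reserved words
    return resources
```

Fed the Group Icon and Icon entries quoted above, the resource walker gives file offsets 1200 and 1300, which agrees with the 1200-12FF and 1300-15FF ranges in the file map.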
What does this mean for CRACKERS???

Crackers can modify the information in these various sections to:

If you are lucky enough to own the Professional Version of DoDi's VB tools, you can De-compile most programs into source code which will recompile in Visual Basic after you have made your changes. Unfortunately, the shareware and standard versions don't handle custom controls properly and will probably not give you source code that will re-compile. Your only option is to make the modifications manually.
 

Tokens

While a complete explanation of all of the tokens is beyond this lesson, I will describe some of the more common things you will come across as you examine tokens. Let's take the following small snippet of code:

Let's break this down. The first two bytes '35 49' are the token encoding the number of leading spaces in the original source code. Some of the token words for spacing are as follows:

Hence '35 49' means there are 4 spaces at the start of this line of code (four spaces is also the default TAB in VB). Although this information is only for formatting and is not necessary for program operation, the interpreter expects to see valid tokens here and funny things happen if it doesn't. HINT: This makes it easy to find the start of each line as you look at raw tokens.

'21 2D 1A 00'  References the variable password.

The next bytes '9A 38 0A 00 0C 00 04 00 64 75 6B 65 00 00' are the string definition for 'duke'.

'C3 11'  Follows most Literal String Definitions (performs a PUSH to prepare for next token)
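Since strings are stored literally, a small scanner can list them and help you find your place in the tokens. The Python sketch below is a guess built from the single 'duke' example: it assumes that '9A 38' opens a literal string definition, that the word 6 bytes into the definition (04 00 here) is the string length, and that the text starts 8 bytes in. The meaning of the two words in between is unknown to me, and the function name is made up.

```python
import struct

STRING_TOKEN = bytes.fromhex("9A38")  # token quoted above for a literal string


def find_string_literals(tokens: bytes):
    """Return (offset, text) for every candidate literal string definition."""
    found = []
    pos = tokens.find(STRING_TOKEN)
    while pos != -1:
        if pos + 8 <= len(tokens):
            # Assumption: the length word sits 6 bytes after the token start.
            (length,) = struct.unpack_from("<H", tokens, pos + 6)
            raw = tokens[pos + 8:pos + 8 + length]
            # Keep only complete, printable ASCII candidates.
            if length > 0 and len(raw) == length and all(32 <= b < 127 for b in raw):
                found.append((pos, raw.decode("ascii")))
        pos = tokens.find(STRING_TOKEN, pos + 2)
    return found


# The token bytes quoted in this section, in the order they are described.
sample = bytes.fromhex("3549" "212D1A00" "9A380A000C00040064756B650000" "C311")
print(find_string_literals(sample))
```

Because the byte pair 9A 38 can also turn up inside other data, every hit is only a candidate; the printable-ASCII check weeds out some of the false ones.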
 
 '7A 44' This is the important one. It basically means 'Compare the two variables and if <> then continue'. Hmmm. What would happen if we changed '7A 44' to '6A 44', which means 'Compare the two variables and if = then continue'? You guessed it! Our program would be cracked.

The remaining tokens are the END instruction and the next line spacing token. The easiest way to learn about the different tokens is to write short VB programs, make .EXE's, and compare the tokens with the source code which generated them. When you compare the differences between simple code changes, you will begin to see the patterns. You could also look at the routines for the various tokens but these are very difficult to follow. If you would like to look at the routines, try the following:

This JMP AX is about to go to the first routine. If you look at AX, you will notice that its value is the first token. It was loaded with the previous instruction LODSW ES. The tokens are actually the addresses of the routines to be performed! If you DB ES:0 you will see the tokens the way VBRUN is referencing them using SI. As you step through, you can watch the tokens being loaded and their routines run.
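The fetch-and-jump loop is easier to picture as a toy model. The Python sketch below imitates it: a word is read at SI (the LODSW), SI moves on by two, and the word selects the routine to run (the JMP AX). In VBRUN the token is itself the routine's address; a dictionary stands in for that here. The two routines are hypothetical stand-ins for the tokens discussed earlier, and treating the word after '21 2D' as an operand is my assumption.

```python
import struct


def on_indent(tokens, si, trace):
    # '35 49' was described earlier as the token for four leading spaces.
    trace.append("4935: start of line, 4 leading spaces")
    return si


def on_variable(tokens, si, trace):
    # Assumption: the word following '21 2D' identifies the variable.
    (operand,) = struct.unpack_from("<H", tokens, si)
    trace.append(f"2D21: variable reference {operand:04X}")
    return si + 2


# The bytes '35 49' read as a little-endian word give 4935, and so on.
ROUTINES = {0x4935: on_indent, 0x2D21: on_variable}


def run_tokens(tokens, routines=ROUTINES):
    """Model of the dispatch loop: LODSW, then JMP AX."""
    si = 0
    trace = []
    while si + 2 <= len(tokens):
        (ax,) = struct.unpack_from("<H", tokens, si)  # LODSW: AX = word at SI
        si += 2                                       # LODSW also advances SI
        routine = routines.get(ax)
        if routine is None:
            trace.append(f"{ax:04X}: unknown token, stopping")
            break
        si = routine(tokens, si, trace)               # JMP AX
    return trace


for line in run_tokens(bytes.fromhex("3549212D1A00")):
    print(line)
```

Each routine returns the new SI, which is how a token that carries operands skips over them before the next fetch.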
Now that you have a basic understanding of how the tokens work, let's move on....

Forms and Controls

Before I go into an actual Form, there is another resource which describes the various forms and controls in the program. Let's take another section of CALC.EXE:
 
    1600:03 20 81 80 FF FF 43 41 4C 43 00 00 00 00 00 05
    1610:00 01 00 43 41 4C 43 00 00 46 09 04 80 46 00 FF
    1620:01 A4 48 00 43 41 4C 43 2E 46 52 4D 00 00 00 58
    .....

This is the start of the 'Forms Definitions' section. It contains the names of the form (.FRM) files used in the compile, names of any .VBX files needed by the program, and references for both common and custom controls.
 The start (header) of this section is '03 20 81 80' and there is only one of these sections in an .EXE (that I have seen, anyway). 'FF FF' always follows.
 The next nine bytes contain the program name: the eight byte DOS limit and a terminating 0. If the name is less than 8 bytes long, the extra space is padded w/ 0's.
 The next bytes '05 00' are the length of the Application Title with a terminating 0. This title may be up to 29 (hex) bytes long and is entered at compile time.
 '01 00' ?????? Possibly the number of titles?
 '43 41 4C 43 00' is the title 'CALC'
The next '00' is padding.
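The fixed start of this section is regular enough to decode in a few lines. Here is a Python sketch that follows the field order just described; the function name and the dictionary keys are my own, and the '01 00' word is simply passed through as unknown since I cannot say what it means.

```python
import struct

FORMS_DEF_HEADER = bytes.fromhex("03208180")


def parse_forms_definitions(section: bytes) -> dict:
    """Decode the fixed start of a Forms Definitions section."""
    if section[:4] != FORMS_DEF_HEADER or section[4:6] != b"\xff\xff":
        raise ValueError("not a Forms Definitions section")
    # Nine bytes: up to eight name characters, padded and terminated with 0's.
    program_name = section[6:15].split(b"\x00", 1)[0].decode("ascii")
    # Title length (terminating 0 included), then the unexplained '01 00' word.
    title_length, unknown_word = struct.unpack_from("<2H", section, 15)
    title = section[19:19 + title_length].split(b"\x00", 1)[0].decode("ascii")
    return {
        "program_name": program_name,
        "title_length": title_length,
        "unknown_word": unknown_word,
        "title": title,
    }


# The first two rows of the dump at 1600.
dump = bytes.fromhex(
    "03208180FFFF43414C43000000000005"
    "00010043414C430000460904804600FF"
)
print(parse_forms_definitions(dump))
```

The definitions of the names and controls begin after the title and its padding byte; the sketch stops before them.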
The bytes which follow are the definitions of the names and controls. They have the following format:
 Now for the Form: CALC.EXE only has one form which starts at 1A00. The form can be thought of as two sections: the form description and the controls description. The form(s) will have the header 'FF CC 2C 00'. (The header is actually just CCFF but all of the forms I have seen follow this with 002C; it doesn't seem to ever be checked. VBRUN300 will also accept CC23 as a valid form section header although I have not yet seen it in an .EXE.)
This form has 7 controls,
Offset to end = 08A3 (End of form : 1A05 + 08A3 =  22A8),
Offset to first control = 038D (First control : 1A09 + 038D = 1D96)
I don't know what the 0D is....

The important thing to remember from this point is that VB starts with a default form and makes changes to it from there.
If our form were a default VB form, the 'FF 00' @ 1A16 would be located at 1A10; this FF designates the end of the basic window properties. Instead we have three entries:

As you can see, the changes are made by adding a 'Property ID' and the new value. The three properties changed above all had values that were only one byte long but this is not always the case. The Form's caption is next, then starts property 05 @ 1A23. The next 4 DWORDS are the values for property 05 which set the size and location of the window. Changing these will move and/or resize the window. The next property, 0C, is a font change. This is followed by the font name and 5 words which describe its attributes. At 1A46 is property 23, an Icon. Icons, as well as images, are stored inline, in standard format and can be edited. The next word 02FE is the offset to the end of the icon. Right after the icon is property 24 @1D49; the Link Topic stored in the usual VB string format. And after that, property 25 which is the Link Mode. Last, and definitely least, are properties 35, 36, 37, and 38. Do you recognize their values? They are the same DWORDS as property 05 above. There is a difference though, changing these does nothing. I don't know if these are used for anything at all. The next property is FF meaning of course 'No More Properties'. A few unknown bytes and we are on to our next part: the controls section (remember though, this is all one resource).
 The controls on this form start at 1D96. Since there are a lot of buttons on our 'Calculator', the section is too long to go over the whole thing but here is a chunk of it:
  Let's look at the first control in detail:
  If our program contained a menu, the items would also be listed in this fashion. The hierarchy can get quite messy but the key is in the first byte(s) of the control. The bytes following the first may or may not be the offset to next. If they are 01 - 05 they are hierarchy codes (04 meaning no more controls). If they are > 5 then they are the offset to the next control. After you examine this section of a program with a complex menu, you will see what is going on.

 Unfortunately, due to the large number of control properties, I cannot give you a list of them. It is, however, fairly easy to find the code of a property you are looking for .... Just compile a test program with whatever control you are trying to find the property for, make an .EXE out of it, then change the property and make another .EXE. When you binary compare the Form sections of the two programs, you will see what bytes have been added to change the property. This is the best way to find out most of what is in the .EXE.

* A note on VB3: VB3 has a strange habit of compiling the exact same source code into slightly different .EXE's between the first and second compiles. When making your reference file, compile your source code TWICE without changing anything. It will ask you if you want to overwrite the existing .EXE; answer YES. NOW rename the .EXE and compile it a THIRD time. A binary compare of these should show them to be identical; if not, repeat this until you can get two files which are identical. THEN make your changes and Compile again. This is necessary on the VB3 that I have; you may want to test it on yours.
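If your hex editor has no binary compare, the comparison is easy to script. The Python sketch below is one simple way to do it, with a function name of my own choosing: it walks two byte strings in step and reports each run of differing bytes with its offset. It compares position by position, so once the second file has extra bytes inserted, everything after the insertion point will tend to show up as different; the first reported run is then the place to start looking.

```python
def diff_runs(old: bytes, new: bytes):
    """Return (offset, old_bytes, new_bytes) for each run of differing bytes."""
    runs = []
    limit = min(len(old), len(new))
    i = 0
    while i < limit:
        if old[i] == new[i]:
            i += 1
            continue
        start = i
        while i < limit and old[i] != new[i]:
            i += 1
        runs.append((start, old[start:i], new[start:i]))
    if len(old) != len(new):
        # One file is longer: report the tail as a final run.
        runs.append((limit, old[limit:], new[limit:]))
    return runs


# Typical use (the file names are placeholders):
#   from pathlib import Path
#   for offset, a, b in diff_runs(Path("TEST1.EXE").read_bytes(),
#                                 Path("TEST2.EXE").read_bytes()):
#       print(f"{offset:04X}: {a.hex(' ')} -> {b.hex(' ')}")
```

An empty result means the two files are identical, which is the state you want to reach before making the one change you are studying.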

Here is a list of some of the Form Properties you may want to change:

and standard control types:

The last resource in our CALC program is the control names resource @ 2300. Not too much to talk about here: the first entry is the name of the form, subsequent entries are the names of the controls on the form. With a control array, only the first item is listed. This section is not needed at all for the program to run and it can be removed (and is!) without effect. Each control defined in the control section has a reference to the position in this list of the control's name. Unfortunately, the program's variable and sub/function names are not stored anywhere in the program, and hence can never be recovered. If our program had more than one form, the additional form(s) would follow alternating with their control names section(s).
 

"Protection" from De-Compilers.

 First let me start by saying that NO PROGRAM CAN BE PROTECTED FROM A GOOD DE-COMPILER!!!! This is not to say that DoDi's De-Compiler is not good, but he has written it with the intent of being able to prevent it from working. As long as the Program Tokens are in the .EXE (and they must be for the program to function) those tokens can be de-compiled back to the original source code. So when I talk about Un-Protecting a file, what we are really talking about is making it acceptable to DoDi's de-compiler.

 Programs are protected by removing the sections which are not needed for the program to run, but ARE needed for the de-compiler. These sections are the .FRM names in the Forms definitions section and the Control Names resource(s). Get MAKE_MAK.EXE from the net if you don't already have it. It is a VB Protector. Make a copy of our CALC.EXE with the name CALC.OLD and using MAKE_MAK, 'Protect' CALC.EXE.

When you start to HEX examine the file, look at the following things:

DoDi's de-compiler will now refuse to work on our file. It is detecting the changes and refusing to run (CHEAT!). But all we have to do is fix these sections and it will work, right?? ABSOLUTELY!!! Since we know what was originally contained in these sections, we can rebuild them exactly. If we were dealing with someone else's protected file, we could only guess at what their forms and controls were named but IT WILL DE-COMPILE once it is fixed.

Un-Protecting a file:

* If the program has been protected with DoDi's VBOPT, there will be one additional step needed in order to un-protect it. I won't tell you this step out of respect for the writer of the only VB de-compiler I know of, but it isn't hard to figure out. I have faith in all of you!!

 
Final Notes

 If there is something that I have not made clear, even after much time spent trying to figure it out for yourself, or if you know of parts of my essay that are just wrong, please e-mail me at vbman@nassau.cv.net and I will do my best to help.
Please Don't email duke@nassau.cv.net, it's not me. Someone got the address before I did :(
Ob Duh

Ob duh does not apply here... on the contrary: visual basic buffs should pay Duke for this kind of information...
