C Parser Does Not Parse A Simple Struct #1069

Closed
raymontag opened this issue Sep 25, 2019 · 8 comments

@raymontag

Describe the bug
I tried recently to parse certain data structures from the XNU kernel. I encountered a simple struct that could not be parsed.

The struct is defined in an XNU header file. All members are of type __uint32_t, which is a typedef for unsigned int.

Interestingly, the parsing works if the type in the struct definition is changed to unsigned int. Therefore, I guess the parser cannot parse the type definition for some reason.

More interestingly, the parser recognizes __uint32_t as valid.

I tried this with 9.0.4 and 9.1-BETA.

Lines mentioned in the screenshot:

#line 35: "<redacted>/xnu-4903.241.1/osfmk/mach/arm/_structs.h"


struct arm_exception_state
{
	__uint32_t	exception; 
	__uint32_t	fsr; 
	__uint32_t	far; 
};

To Reproduce
Steps to reproduce the behavior:

  1. Create an empty parser profile
  2. Add the following configuration
-I<redacted>/xnu-4903.241.1/osfmk/
-I<redacted>/xnu-4903.241.1/libkern/
-I<redacted>/xnu-4903.241.1/iokit/
-I<redacted>/xnu-4903.241.1/bsd/
-I<redacted>/xnu-4903.241.1/EXTERNAL_HEADERS/
-D__arm64__
-DKERNEL=1
-DIOKIT
-D__GNUC__
  3. Parse the following header from XNU: xnu-4903.241.1/osfmk/mach/arm/_structs.h
    3.1. You need to download the whole XNU tree from https://opensource.apple.com/tarballs/xnu/xnu-4903.241.1.tar.gz
  4. Try to parse.

Expected behavior
The header is parsed and added to the data type library.

Screenshots
parsing_test

Environment (please complete the following information):

  • OS: Linux
  • Java Version: OpenJDK 11
  • Ghidra Version: 9.0.4 and 9.1-BETA
@ryanmkurtz ryanmkurtz added Feature: C-Parser Type: Bug Something isn't working labels Sep 25, 2019
@emteere
Contributor

emteere commented Sep 25, 2019

If you look in the CParserPlugin.out file, what is around line 392, and is there an actual definition for __uint32_t before that use?
Can you post your CParserPlugin.out file if it isn't too large? Be aware it may have pathnames in it. If the file is too large, can you post up to about line 500?

@raymontag
Author

@emteere Sure, line 392 is the last line of the following snippet ("};")

#line 35: "<redacted>/xnu-4903.241.1/osfmk/mach/arm/_structs.h"


struct arm_exception_state
{
	__uint32_t	exception; 
	__uint32_t	fsr; 
	__uint32_t	far; 
};

__uint32_t is defined earlier, in bsd/arm/_types.h, as can be seen in the CParserPlugin.out.

I attached here the CParserPlugin.out as well (< 500 lines actually):

CParserPlugin.out.txt

@0x6d696368

0x6d696368 commented Sep 27, 2019

The problem seems to be that far is a keyword in the C.jj grammar. So:

typedef unsigned int __uint32_t;

struct arm_exception_state
{
	__uint32_t	exception; 
	__uint32_t	fsr; 
	__uint32_t	far; 
};

fails.

But

typedef unsigned int __uint32_t;

struct arm_exception_state
{
	__uint32_t	exception; 
	__uint32_t	fsr; 
	__uint32_t	far__; 
};

parses successfully.


The CParser seems to need a lot of work. I'm not sure that you can get away with one grammar, since different compilers use different keywords and language extensions, and a keyword in one dialect can be an ordinary identifier in another.
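To make the dialect point concrete, here is a minimal sketch of a keyword lookup that depends on the dialect. Everything in it (the `demo_` names, the two dialects, the abbreviated keyword lists) is hypothetical and only for illustration; it is not how C.jj or the CParser is structured.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical dialects for this sketch; not names used by Ghidra. */
enum demo_dialect { DEMO_DIALECT_GNU, DEMO_DIALECT_DOS16 };

/* Keywords shared by every dialect in this sketch (heavily abbreviated). */
static const char *const demo_common_keywords[] = {
	"struct", "typedef", "unsigned", "int"
};

/* Pointer-size modifiers that 16-bit DOS-era compilers reserve. */
static const char *const demo_dos16_keywords[] = { "far", "near", "huge" };

/* Returns 1 if tok appears in list, 0 otherwise. */
static int demo_in_list(const char *tok, const char *const *list, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tok, list[i]) == 0)
			return 1;
	}
	return 0;
}

/* Returns 1 if tok is reserved in the given dialect, 0 if it may be used
 * as an ordinary identifier there. */
int demo_is_keyword(enum demo_dialect dialect, const char *tok)
{
	size_t n_common = sizeof demo_common_keywords / sizeof demo_common_keywords[0];
	size_t n_dos16 = sizeof demo_dos16_keywords / sizeof demo_dos16_keywords[0];

	if (demo_in_list(tok, demo_common_keywords, n_common))
		return 1;
	if (dialect == DEMO_DIALECT_DOS16)
		return demo_in_list(tok, demo_dos16_keywords, n_dos16);
	return 0;
}
```

With a single fixed keyword set, `far` is either always reserved (breaking headers like the XNU one above) or never reserved (breaking headers that rely on it as a modifier); a per-dialect lookup avoids having to choose.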


EDIT: You can work around the issue by adding -Dfar=far__ to your Parse Options.
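As a rough illustration of what that option does, the sketch below uses a `#define` in place of `-Dfar=far__`, assuming the usual preprocessor semantics for `-D`. The `demo_uint32_t` typedef and the `far_member_offset` helper are hypothetical stand-ins added for the example; only the struct layout comes from the XNU header.

```c
#include <stddef.h>

/* Same effect as passing -Dfar=far__ on the command line. */
#define far far__

/* Hypothetical stand-in for __uint32_t. */
typedef unsigned int demo_uint32_t;

struct arm_exception_state
{
	demo_uint32_t	exception;
	demo_uint32_t	fsr;
	demo_uint32_t	far;	/* the preprocessor rewrites this to far__ */
};

/* Returns the byte offset of the renamed member. */
size_t far_member_offset(void)
{
	return offsetof(struct arm_exception_state, far__);
}
```

The layout of the struct is unchanged; only the member name differs, so the resulting data type will show `far__` instead of `far`.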

@raymontag: If I have time I can try adding the XNU data types to my repo: https://github.com/0x6d696368/ghidra-data/tree/master/typeinfo
It won't be perfect but at least you would get some type information.
Which headers do you need exactly? That is, where in the source tree is the interesting stuff?

@raymontag
Author

Thanks for the workaround, it worked! Unfortunately there are many more quirks, probably because XNU uses a lot of GCC extensions.

@0x6d696368 At the moment, I try to parse everything from iokit/ (which involves most of the kernel code, though...)

@0x6d696368

> @0x6d696368 At the moment, I try to parse everything from iokit/ (which involves most of the kernel code, though...)

I actually already had the XNU source and now remember what the problem with it was back then: it is mostly C++ or ObjC, so it cannot be parsed with the CParser at all. You could parse the headers that declare extern "C". Not sure if that would be helpful.
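For reference, headers of that kind usually follow the pattern sketched below: the `extern "C"` wrapper is guarded by `__cplusplus`, so when that macro is not defined the preprocessor drops the wrapper and only plain C declarations remain. The `demo_point` names are hypothetical and not taken from XNU.

```c
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical C-compatible declarations shared between C and C++ code. */
struct demo_point {
	int x;
	int y;
};

int demo_point_sum(const struct demo_point *p);

#ifdef __cplusplus
}
#endif

/* Returns the sum of both coordinates. */
int demo_point_sum(const struct demo_point *p)
{
	return p->x + p->y;
}
```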

@0x6d696368

0x6d696368 commented Oct 5, 2019

@raymontag I recently learned you can populate a Data Type Archive from DWARF debug information. So you could try getting XNU DWARF info (e.g. https://github.com/argp/xnu cites it in the README) and generate the Data Type Archive this way ... even for C++ code.

EDIT: It is a bit backwards to first build the whole XNU kernel to extract the Data Type info, but it's the only workaround I could come up with. Hope that helps.

@raymontag
Author

@0x6d696368 Thx, that sounds like a neat workaround!

@mgmchenry

I would rate the CParserPlugin as nearly unusable. I was trying to generate headers for Lua 5.3.5 and found that, whether I used VS or gcc for the standard includes, it could not parse seemingly quite basic #ifdef/#endif pairs (this is documented in other open issues for CParserPlugin).

I did find a workaround. I created a custom header file that, when preprocessed with gcc, generates a pre-parsed header file that Ghidra can consume successfully.

gcc -E -dU -P -I".\src" -o luaTypes.gcc-E-dU.parsed.5.3.5.h luaTypes.gcc-E-dD.5.3.5.h

Creating the custom header file is a little like porting to a new platform. I eliminated as many dependencies and unneeded attributes as possible with strategic defines:
#undef __stdcall
#undef __cdecl
#undef _stdcall
#undef _cdecl

#define __stdcall
#define __cdecl
#define _stdcall
#define _cdecl

typedef void * va_list; /* platform independent va_list */
#define __va_list__ /* to suppress va_list definition in stdarg.h */
#define __GNUC_VA_LIST /* to suppress __gnuc_va_list in stdarg.h */
#define _GCC_MAX_ALIGN_T /* to suppress max_align_t in stddef.h */
#define _INC__MINGW_H /* to suppress __debugbreak and __mingw_get_crt_info in _mingw.h */
/* #define _INC_CRTDEFS */
#define __MINGW_EXTENSION
#define _CRT_ERRNO_DEFINED
#define _ERRCODE_DEFINED
#define _INC_STDDEF

#include <stdarg.h>
#undef __GNUC__
/* undefine __GNUC__ after stdarg.h to prevent problematic defines in llimits.h and luaconf.h */
/* but must remain defined before stdarg.h is parsed */
#define _CRTIMP
#define _INC_STDIO

#define LUA_COMPAT_5_2

#include "lprefix.h"
#include "lua.h"
#include "lapi.h"
and so on, finally processing the target header files.

This was a giant pain, but I was able to produce a type archive with exactly what I needed in it.
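The empty-define trick used in the custom header can be sketched in isolation as follows. This is a minimal example assuming a compiler where the calling-convention keyword is either undefined or a macro; `demo_add` is a hypothetical function, not part of Lua.

```c
/* Remove any existing definition, then define the keyword as empty so the
 * preprocessor erases it from every declaration that follows. */
#undef __stdcall
#define __stdcall

/* After preprocessing this reads: int demo_add(int a, int b); */
int __stdcall demo_add(int a, int b);

/* Returns the sum of both arguments. */
int __stdcall demo_add(int a, int b)
{
	return a + b;
}
```

Running the header through `gcc -E` then yields declarations with the attribute already stripped, which is the form the parser is more likely to accept.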

@emteere emteere self-assigned this Sep 28, 2021
@ryanmkurtz ryanmkurtz added this to the 10.0.5 milestone Oct 14, 2021
@ryanmkurtz ryanmkurtz modified the milestones: 10.0.5, 10.1 Nov 5, 2021