macaw-base
Safe Haskell: None
Language: Haskell2010

Data.Macaw.Discovery.State

Description

This defines the data structures for storing information learned from code discovery. The DiscoveryState is the main data structure representing this information.

Synopsis

DiscoveryState

data DiscoveryState arch Source #

Information discovered about the program

type AddrSymMap (w :: Nat) = Map (MemSegmentOff w) ByteString Source #

Maps code addresses to the associated symbol name if any.

exploredFunctions :: DiscoveryState arch -> [Some (DiscoveryFunInfo arch)] Source #

Return list of all functions discovered so far.

emptyDiscoveryState Source #

Arguments

:: Memory (ArchAddrWidth arch)

State of memory

-> AddrSymMap (ArchAddrWidth arch)

Map from addresses to their symbol name (if any)

-> ArchitectureInfo arch

architecture/OS specific information

-> DiscoveryState arch 

Create empty discovery information.

memory :: DiscoveryState arch -> Memory (ArchAddrWidth arch) Source #

The initial memory when disassembly started.

symbolNames :: DiscoveryState arch -> AddrSymMap (ArchAddrWidth arch) Source #

Map addresses to known symbol names

archInfo :: DiscoveryState arch -> ArchitectureInfo arch Source #

Architecture-specific information needed for discovery.

data GlobalDataInfo w Source #

Information about a region of memory.

Constructors

JumpTable !(Maybe w)

A jump table that appears to end just before the given address.

ReferencedValue

A value that appears in the program text.

Instances

(Integral w, Show w) => Show (GlobalDataInfo w) Source #

Defined in Data.Macaw.Discovery.State

globalDataMap :: forall arch f. Functor f => (Map (ArchMemAddr arch) (GlobalDataInfo (ArchMemAddr arch)) -> f (Map (ArchMemAddr arch) (GlobalDataInfo (ArchMemAddr arch)))) -> DiscoveryState arch -> f (DiscoveryState arch) Source #

Map each jump table start to the address just after the end.

funInfo :: forall arch f. Functor f => (Map (ArchSegmentOff arch) (Some (DiscoveryFunInfo arch)) -> f (Map (ArchSegmentOff arch) (Some (DiscoveryFunInfo arch)))) -> DiscoveryState arch -> f (DiscoveryState arch) Source #

Get information for specific functions

unexploredFunctions :: forall arch f. Functor f => (UnexploredFunctionMap arch -> f (UnexploredFunctionMap arch)) -> DiscoveryState arch -> f (DiscoveryState arch) Source #

List of functions to explore next.
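
Accessors such as globalDataMap, funInfo and unexploredFunctions have the shape Functor f => (a -> f a) -> s -> f s, which is the van Laarhoven lens encoding. The following is a minimal sketch, using only base, of how a function of that shape can be used to read and update a field. ToyState, toyFunInfo, viewL and overL are hypothetical stand-ins written for this illustration and are not part of macaw; real code would more likely use the operators of a lens library.

```haskell
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Hypothetical stand-in for DiscoveryState: a single field that maps
-- function addresses to names (an association list keeps this base-only).
newtype ToyState = ToyState { _toyFunInfo :: [(Int, String)] }
  deriving (Eq, Show)

-- A lens with the same shape as funInfo.
toyFunInfo
  :: Functor f
  => ([(Int, String)] -> f [(Int, String)])
  -> ToyState
  -> f ToyState
toyFunInfo f s = fmap (\m -> s { _toyFunInfo = m }) (f (_toyFunInfo s))

-- Read through a lens by instantiating f to Const.
viewL :: ((a -> Const a a) -> s -> Const a s) -> s -> a
viewL l = getConst . l Const

-- Modify through a lens by instantiating f to Identity.
overL :: ((a -> Identity a) -> s -> Identity s) -> (a -> a) -> s -> s
overL l g = runIdentity . l (Identity . g)
```

In GHCi, viewL toyFunInfo st reads the field and overL toyFunInfo ((8192, "helper") :) st returns a state with one more entry.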

data NoReturnFunStatus Source #

Flags whether a function is labeled no return or not.

Constructors

NoReturnFun

Function labeled no return

MayReturnFun

Function may return

Instances

Show NoReturnFunStatus Source #

Defined in Data.Macaw.Architecture.Info

Pretty NoReturnFunStatus Source #

Defined in Data.Macaw.Architecture.Info

Methods

pretty :: NoReturnFunStatus -> Doc ann

prettyList :: [NoReturnFunStatus] -> Doc ann

trustedFunctionEntryPoints :: forall arch f. Functor f => (Map (ArchSegmentOff arch) NoReturnFunStatus -> f (Map (ArchSegmentOff arch) NoReturnFunStatus)) -> DiscoveryState arch -> f (DiscoveryState arch) Source #

Retrieves functions that are trusted entry points.

exploreFnPred :: forall arch f. Functor f => ((ArchSegmentOff arch -> Bool) -> f (ArchSegmentOff arch -> Bool)) -> DiscoveryState arch -> f (DiscoveryState arch) Source #

DiscoveryFunInfo

data DiscoveryFunInfo arch ids Source #

Information discovered about a particular function

Constructors

DiscoveryFunInfo 

Fields

Instances

ArchConstraints arch => Pretty (DiscoveryFunInfo arch ids) Source #

Defined in Data.Macaw.Discovery.State

Methods

pretty :: DiscoveryFunInfo arch ids -> Doc ann

prettyList :: [DiscoveryFunInfo arch ids] -> Doc ann

discoveredFunName :: MemWidth (ArchAddrWidth arch) => DiscoveryFunInfo arch ids -> ByteString Source #

Returns the "name" associated with a function.

This is either the symbol or the address.
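
The symbol-or-address fallback can be sketched as follows. ToyFun and toyFunName are hypothetical names for this illustration, and the hexadecimal formatting of the address is an assumption; the library's actual rendering of addresses may differ.

```haskell
import Data.Maybe (fromMaybe)
import Numeric (showHex)

-- Hypothetical stand-in for a discovered function: an entry address and
-- an optional symbol name.
data ToyFun = ToyFun
  { toyAddr   :: Integer
  , toySymbol :: Maybe String
  }

-- Prefer the symbol; otherwise fall back to the address rendered in hex.
-- The "0x..." format is an assumption of this sketch.
toyFunName :: ToyFun -> String
toyFunName f = fromMaybe ("0x" ++ showHex (toyAddr f) "") (toySymbol f)
```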

parsedBlocks :: forall arch ids f. Functor f => (Map (ArchSegmentOff arch) (ParsedBlock arch ids) -> f (Map (ArchSegmentOff arch) (ParsedBlock arch ids))) -> DiscoveryFunInfo arch ids -> f (DiscoveryFunInfo arch ids) Source #

Parsed block

data ParsedBlock arch ids Source #

A contiguous region of instructions in memory.

Constructors

ParsedBlock 

Fields

Instances

(ArchConstraints arch, Show (ArchBlockPrecond arch)) => Show (ParsedBlock arch ids) Source #

Defined in Data.Macaw.Discovery.ParsedContents

Methods

showsPrec :: Int -> ParsedBlock arch ids -> ShowS #

show :: ParsedBlock arch ids -> String #

showList :: [ParsedBlock arch ids] -> ShowS #

ArchConstraints arch => Pretty (ParsedBlock arch ids) Source #

Defined in Data.Macaw.Discovery.ParsedContents

Methods

pretty :: ParsedBlock arch ids -> Doc ann

prettyList :: [ParsedBlock arch ids] -> Doc ann

Block terminal statements

data ParsedTermStmt arch ids Source #

This terminal statement describes, at a higher level, how a block ending with a FetchAndExecute statement should be interpreted.

Constructors

ParsedCall !(RegState (ArchReg arch) (Value arch ids)) !(Maybe (ArchSegmentOff arch))

A call with the current register values and location to return to or Nothing if this is a tail call.

Note that the semantics of this instruction assume that the program has already stored the return address in the appropriate location (which depends on the ABI). For example on X86_64 this is the top of the stack while on ARM this is the link register.

PLTStub !(MapF (ArchReg arch) (Value arch ids)) !(ArchSegmentOff arch) !VersionedSymbol

PLTStub regs addr sym denotes a terminal statement that has been identified as a PLT stub for jumping to the given symbol (with optional version information).

This is a special case of a tail call. It has been added separately because it occurs frequently in dynamically linked code, and we can use this to recognize PLT stubs.

The first argument maps registers that were changed to their value. Other registers have the initial value. This should typically be empty on X86_64 PLT stubs.

The second argument is the address in the .GOT that the target function is stored at. The PLT stub sets the PC to the address stored here.

The third argument is the versioned symbol, which is used to resolve where the function should jump to.

ParsedJump !(RegState (ArchReg arch) (Value arch ids)) !(ArchSegmentOff arch)

A jump to an explicit address within a function.

ParsedBranch !(RegState (ArchReg arch) (Value arch ids)) !(Value arch ids BoolType) !(ArchSegmentOff arch) !(ArchSegmentOff arch)

ParsedBranch regs cond trueAddr falseAddr represents a conditional branch that jumps to trueAddr if cond is true and falseAddr otherwise.

The value assigned to the IP in regs should reflect this if-then-else structure.

ParsedLookupTable !(JumpTableLayout arch) !(RegState (ArchReg arch) (Value arch ids)) !(ArchAddrValue arch ids) !(Vector (ArchSegmentOff arch))

A lookup table that branches to one of a vector of addresses.

The register state holds the register values at the jump, the address value is the index to jump to, and the vector lists the possible target addresses. If the index (when interpreted as an unsigned number) is larger than the number of entries in the vector, then the result is undefined.

ParsedReturn !(RegState (ArchReg arch) (Value arch ids))

A return with the given registers.

ParsedArchTermStmt !(ArchTermStmt arch (Value arch ids)) !(RegState (ArchReg arch) (Value arch ids)) !(Maybe (ArchSegmentOff arch))

An architecture-specific statement with the registers prior to execution, and the given next control flow address.

ParsedTranslateError !Text

An error occurred in translating the block.

ClassifyFailure !(RegState (ArchReg arch) (Value arch ids)) [String]

The classifier failed to identify the block. Includes the registers and a list of the reasons each classifier failed.
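
The unsigned interpretation of the ParsedLookupTable index can be sketched as below. asUnsigned and toyLookup are hypothetical helpers, targets are plain Integers rather than segment offsets, and the sketch assumes zero-based indexing, so any index that is not a valid position in the vector is reported as Nothing where the description above says the result is undefined.

```haskell
import Data.Int (Int64)
import Data.Word (Word64)

-- Reinterpret a signed 64-bit value as unsigned; negative values become
-- very large indices.
asUnsigned :: Int64 -> Word64
asUnsigned = fromIntegral

-- Hypothetical helper: select the jump target for an unsigned index.
-- Nothing stands in for the "undefined" out-of-range case.
toyLookup :: [Integer] -> Word64 -> Maybe Integer
toyLookup targets ix
  | ix < fromIntegral (length targets) = Just (targets !! fromIntegral ix)
  | otherwise                          = Nothing
```

A negative index such as asUnsigned (-1) therefore falls outside any table of realistic size instead of indexing backwards.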

Instances

ArchConstraints arch => Show (ParsedTermStmt arch ids) Source #

Defined in Data.Macaw.Discovery.ParsedContents

Methods

showsPrec :: Int -> ParsedTermStmt arch ids -> ShowS #

show :: ParsedTermStmt arch ids -> String #

showList :: [ParsedTermStmt arch ids] -> ShowS #

parsedTermSucc :: ParsedTermStmt arch ids -> [ArchSegmentOff arch] Source #

Get the addresses of all successor blocks for the given terminal statement.
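
To illustrate what successors mean for the constructors above, here is a simplified model. ToyTerm and toySucc are hypothetical, register states are omitted, and addresses are plain Integers. The sketch assumes that a call continues at its return address and that returns have no successor inside the function; it follows the constructor descriptions and is not taken from the library's implementation.

```haskell
import Data.Maybe (maybeToList)

-- Hypothetical, simplified model of terminal statements.
data ToyTerm
  = ToyCall (Maybe Integer)    -- return address, or Nothing for a tail call
  | ToyJump Integer            -- explicit jump target
  | ToyBranch Integer Integer  -- true target, then false target
  | ToyLookupTable [Integer]   -- possible jump-table targets
  | ToyReturn                  -- leaves the function
  deriving (Eq, Show)

-- Addresses within the same function where control may continue.
toySucc :: ToyTerm -> [Integer]
toySucc t = case t of
  ToyCall ret      -> maybeToList ret
  ToyJump tgt      -> [tgt]
  ToyBranch tt ff  -> [tt, ff]
  ToyLookupTable v -> v
  ToyReturn        -> []
```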

JumpTableLayout

data JumpTableLayout arch Source #

This describes the layout of a jump table. Beware: on some architectures, after reading from the jump table, the resulting addresses must be aligned. See the IPAlignment class.

Constructors

AbsoluteJumpTable !(BoundedMemArray arch (BVType (ArchAddrWidth arch)))

AbsoluteJumpTable r describes a jump table where the jump target is directly stored in the array read r.

RelativeJumpTable !(ArchSegmentOff arch) !(BoundedMemArray arch (BVType w)) !(Extension w)

RelativeJumpTable base read ext describes information about a jump table where all jump targets are relative to a fixed base address.

The value is computed as baseVal + readVal, where baseVal is the address given by base and readVal is the value stored at the memory read described by read, extended with the signedness given by ext.
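
The baseVal + readVal computation for a relative jump table can be sketched as follows. ToyExt, extend32 and relativeTarget are hypothetical names; the sketch assumes 32-bit table entries and 64-bit addresses, and uses plain machine words in place of segment offsets.

```haskell
import Data.Int (Int32)
import Data.Word (Word32, Word64)

-- Hypothetical stand-in for the extension mode of a table entry.
data ToyExt = SignExt | ZeroExt

-- Widen a 32-bit table entry to 64 bits according to the extension mode.
extend32 :: ToyExt -> Word32 -> Word64
extend32 ZeroExt w = fromIntegral w
extend32 SignExt w = fromIntegral (fromIntegral w :: Int32)

-- Target = base + extended entry, using wrap-around 64-bit arithmetic so
-- that a negative (sign-extended) entry moves the target below the base.
relativeTarget :: Word64 -> ToyExt -> Word32 -> Word64
relativeTarget base ext entry = base + extend32 ext entry
```

With base 0x401000 and entry 0xFFFFFFF0, sign extension gives 0x400FF0 (the base minus 16), while zero extension gives 0x100400FF0, which shows why the extension mode has to be recorded alongside the read.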

Instances

RegisterInfo (ArchReg arch) => Show (JumpTableLayout arch) Source #

Defined in Data.Macaw.Discovery.ParsedContents

data Extension (w :: Natural) Source #

Information about a value that is the signed or unsigned extension of another value.

This is used for jump tables, and only supports widths that are in memory

Constructors

Extension 

Fields

Instances

Show (Extension w) Source #

Defined in Data.Macaw.Discovery.ParsedContents

jtlBackingAddr :: JumpTableLayout arch -> ArchSegmentOff arch Source #

Return base address of table storing contents of jump table.

jtlBackingSize :: JumpTableLayout arch -> Word64 Source #

Returns the number of bytes in the layout

BoundedMemArray

data BoundedMemArray arch (tp :: Type) Source #

This describes a region of memory dereferenced in some array read.

These regions may be sparse: given an index i, the address of element i is arBase + i * arStride.

Constructors

BoundedMemArray 

Fields

  • arBase :: !(ArchSegmentOff arch)

    The base address for array accesses.

  • arStride :: !Word64

    Space between elements of the array.

    This will typically be the number of bytes denoted by arEltType, but may be larger for sparse arrays. matchBoundedMemArray will fail if stride is less than the number of bytes read.

  • arEltType :: !(MemRepr tp)

    Resolved type of elements in this array.

  • arSlices :: !(Vector [MemChunk (ArchAddrWidth arch)])

    The slices of memory in the array.

    The ith element in the vector corresponds to the first size bytes at address `base + stride * i`.

    The number of elements is the length of the array.

    N.B. This could be computed from the previous fields, but we check that we can create it when creating the array read, so we store it to avoid recomputing it.
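
The relationship between base, stride and element size described in the fields above can be sketched as follows. ToyArray, eltAddr and isSparse are hypothetical; addresses are plain Word64 values and the memory slices are reduced to an element count.

```haskell
import Data.Word (Word64)

-- Hypothetical stand-in for BoundedMemArray.
data ToyArray = ToyArray
  { toyBase    :: Word64  -- address of element 0
  , toyStride  :: Word64  -- bytes between consecutive element starts
  , toyEltSize :: Word64  -- bytes read per element
  , toyCount   :: Word64  -- number of elements
  }

-- Address of the i-th element, or Nothing if i is out of range.
eltAddr :: ToyArray -> Word64 -> Maybe Word64
eltAddr a i
  | i < toyCount a = Just (toyBase a + toyStride a * i)
  | otherwise      = Nothing

-- The array is sparse when the stride exceeds the bytes read per element.
isSparse :: ToyArray -> Bool
isSparse a = toyStride a > toyEltSize a
```

For example, a three-element array at 0x1000 with stride 8 and 4-byte elements places element 2 at 0x1010 and leaves a 4-byte gap after each element.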

Instances

RegisterInfo (ArchReg arch) => Show (BoundedMemArray arch tp) Source #

Defined in Data.Macaw.Discovery.ParsedContents

Methods

showsPrec :: Int -> BoundedMemArray arch tp -> ShowS #

show :: BoundedMemArray arch tp -> String #

showList :: [BoundedMemArray arch tp] -> ShowS #

arByteCount :: forall arch (tp :: Type). BoundedMemArray arch tp -> Word64 Source #

Return number of bytes used by this array.

isReadOnlyBoundedMemArray :: forall arch (tp :: Type). BoundedMemArray arch tp -> Bool Source #

Return true if the address stored is readable and not writable.

Reasons for exploring

data FunctionExploreReason (w :: Nat) Source #

This describes why we started exploring a given function.

Constructors

PossibleWriteEntry !(MemSegmentOff w)

Exploring because code at the given block writes this address to memory.

CallTarget !(MemSegmentOff w)

Exploring because the block at the given address terminates with a call that jumps here.

InitAddr

Identified as an entry point from initial information

CodePointerInMem !(MemSegmentOff w)

A code pointer that was stored at the given address.

UserRequest

The user requested that we analyze this address as a function.

ppFunReason :: forall (w :: Nat). FunctionExploreReason w -> String Source #

Print exploration reason.

data BlockExploreReason (w :: Nat) Source #

This describes why we are exploring a given block within a function.

Constructors

NextIP !(MemSegmentOff w)

Exploring because the given block jumps here.

FunctionEntryPoint

Identified as an entry point from initial information

SplitAt !(MemSegmentOff w) !(BlockExploreReason w)

Added because the address split this block after it had been disassembled. Also includes the reason we thought the block should be there before we split it.
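
Because SplitAt carries the earlier reason, reasons can nest when a block is split more than once. A minimal sketch of unwinding that nesting is shown below; ToyReason, originalReason and splitCount are hypothetical names, and addresses are plain Integers.

```haskell
-- Hypothetical mirror of BlockExploreReason with Integer addresses.
data ToyReason
  = ToyNextIP Integer
  | ToyFunctionEntryPoint
  | ToySplitAt Integer ToyReason
  deriving (Eq, Show)

-- Strip SplitAt wrappers to recover why the block was first explored.
originalReason :: ToyReason -> ToyReason
originalReason (ToySplitAt _ r) = originalReason r
originalReason r                = r

-- Count how many splits are recorded in the reason.
splitCount :: ToyReason -> Int
splitCount (ToySplitAt _ r) = 1 + splitCount r
splitCount _                = 0
```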

DiscoveryState utilities

type RegConstraint (r :: Type -> Type) = (OrdF r, HasRepr r TypeRepr, RegisterInfo r, ShowF r) Source #

Constraint on architecture register values needed by code exploration.