Expose Relative Instruction API #443

stevemk14ebr opened this issue Aug 6, 2015 · 19 comments


Contributor stevemk14ebr commented Aug 6, 2015

Feature Request:
Many modern disassemblers expose somewhere in their API a way to determine whether the current instruction is relative to EIP/RIP in some manner. The Capstone API only exposes the operand types MEM, IMM, REG, and FP. When writing code relocation utilities such as hooking libraries, this is a significant drawback, as it is currently impossible to determine whether an instruction is relative and then modify that displacement if necessary.

This could be resolved by exposing two new features in the API:

1. A flag of some sort indicating whether the current instruction is RIP/EIP-relative.
2. An integer value giving the offset in bytes from the beginning of the instruction to the displacement.

jmp [rip+0xDEADBEEF] "\xFF\x25\xEF\xBE\xAD\xDE"

    cs_insn* CurIns = (cs_insn*)&Instructions[i];
    if (CurIns->Flag & X86_INS_REL)
    {
        Displacement = CurIns->Relative.Displacement; // would be 0xdeadbeef in this example
        OffsetToDisp = CurIns->Relative.Offset;       // would be 2 in this example
    }


Contributor Author stevemk14ebr commented Aug 7, 2015

Unfortunately, (2) is extremely important for my particular use case. I'm writing a hooking library, and it has to fix up those copied relative bytes; if I can't write to the displacement (CurIns->Address + CurIns->DispOffset), then I can't do any fixing up. I also cannot find the file at the path you specified.

Perhaps this offset could be retrieved with just a function (similar to cs_reg_name) instead of modifying the cs_insn struct.
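As a rough illustration of the fix-up described above, the sketch below rewrites a 32-bit displacement in a copied instruction so that it still reaches the original target. It assumes the displacement offset is already known (which is exactly what this issue asks Capstone to expose) and that the displacement is relative to the end of the instruction. The names `read_disp32` and `relocate_disp32` are hypothetical helpers, not Capstone functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Capstone API): read the little-endian,
 * sign-extended 32-bit displacement stored at insn[disp_offset]. */
int64_t read_disp32(const uint8_t *insn, size_t disp_offset)
{
    uint32_t raw;
    assert(insn != NULL);
    raw = (uint32_t)insn[disp_offset]
        | ((uint32_t)insn[disp_offset + 1] << 8)
        | ((uint32_t)insn[disp_offset + 2] << 16)
        | ((uint32_t)insn[disp_offset + 3] << 24);
    /* Sign-extend by hand so the result does not depend on cast behavior. */
    return (raw & 0x80000000u) ? (int64_t)raw - 0x100000000LL : (int64_t)raw;
}

/* Hypothetical helper: patch the displacement of an instruction that was
 * copied from old_addr to new_addr. The target is
 *   old_addr + insn_len + old_disp == new_addr + insn_len + new_disp,
 * so new_disp = old_disp + (old_addr - new_addr).
 * Returns 0 on success, -1 if the new displacement does not fit in 32 bits. */
int relocate_disp32(uint8_t *copy, size_t disp_offset,
                    uint64_t old_addr, uint64_t new_addr)
{
    int64_t new_disp;
    uint32_t raw;
    size_t i;

    assert(copy != NULL);
    /* Assumes the two addresses are less than 2^63 bytes apart. */
    new_disp = read_disp32(copy, disp_offset) + (int64_t)(old_addr - new_addr);
    if (new_disp < INT32_MIN || new_disp > INT32_MAX)
        return -1; /* out of range: the caller needs a different strategy */

    raw = (uint32_t)new_disp; /* modulo 2^32, keeps the two's complement bits */
    for (i = 0; i < 4; i++)
        copy[disp_offset + i] = (uint8_t)(raw >> (8 * i));
    return 0;
}
```

For the example bytes from the opening post, `read_disp32(bytes, 2)` yields the signed value of 0xDEADBEEF. The range check matters in practice because a trampoline allocated far from the original code may be out of reach for a 32-bit displacement.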

Collaborator aquynh commented Aug 7, 2015

But which API can be used to retrieve this "offset"? It does not fit any current API, as far as I can see.

Contributor hlide commented Aug 7, 2015

There is a way, but it is a hack of course. After the opcode bytes you may have an optional displacement encoded in 1, 2 or 4 bytes, followed by an optional immediate encoded in 1, 2 or 4 bytes.

    | opcode bytes | displacement | EoI
    +--------------+--------------+
    offset = insn_size - 1 / 2 / 4

    | opcode bytes | immediate | EoI
    +--------------+-----------+
    offset = insn_size - 1 / 2 / 4

    | opcode bytes | displacement | immediate | EoI
    +--------------+--------------+-----------+
    offset1 = insn_size - 1 / 2 / 4
    offset2 = offset1 - 1 / 2 / 4

So if we know whether there is an immediate or a displacement and what size each one is, we can retrieve their offsets. The question is, do we have their sizes as they are encoded?
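The hack described above can be sketched as a small helper that walks backwards from the end of the instruction. This is only an illustration under the stated assumption that the displacement and immediate are the last fields of the encoding and that their encoded sizes are known; `tail_offsets` and `tail_offsets_from_sizes` are hypothetical names, not part of Capstone.

```c
#include <assert.h>

/* Hypothetical result type: byte offsets from the start of the instruction,
 * or -1 when the field is absent. */
struct tail_offsets {
    int disp_offset;
    int imm_offset;
};

/* Compute the offsets by subtracting the trailing field sizes from the
 * instruction size: the immediate (if any) is last, and the displacement
 * (if any) sits directly before it. */
struct tail_offsets tail_offsets_from_sizes(int insn_size, int disp_size,
                                            int imm_size)
{
    struct tail_offsets out = { -1, -1 };
    int end = insn_size;

    assert(disp_size >= 0 && imm_size >= 0);
    assert(disp_size + imm_size <= insn_size);

    if (imm_size > 0) {
        end -= imm_size;       /* offset1 in the diagram */
        out.imm_offset = end;
    }
    if (disp_size > 0) {
        end -= disp_size;      /* offset2 when an immediate follows */
        out.disp_offset = end;
    }
    return out;
}
```

For the 6-byte `jmp [rip+disp32]` from the opening post, a 4-byte displacement and no immediate give a displacement offset of 2. For a 10-byte `C7 05 disp32 imm32` store, the same arithmetic gives 2 for the displacement and 6 for the immediate.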

Contributor Author stevemk14ebr commented Aug 7, 2015

hlide, I tried doing exactly that by looping over the opcodes, but their size members don't seem to have any meaning. Would it make sense to implement this in the detail structure for x86, so that the interface would be:

    cs_detail* detail = CurIns->detail;
    cs_x86::Displacement = detail->x86.Displacement; // Change the current .disp member to a struct
    Displacement.value;  // Get the value
    Displacement.offset; // Get the offset

Alternatively, add a Miscellaneous struct inside cs_x86 that would contain all the modrm/displacement offsets.

Contributor hlide commented Aug 7, 2015

    struct offsets_and_sizes { . }

    prefixes.offset      // offset where the first prefix starts from instruction address
    prefixes.size        // self-speaking
    opcode.offset        // offset where the 1/2/3-byte opcode starts from instruction address
    opcode.size          // size of 1/2/3-byte opcode
    modrm.offset         // offset where the modrm starts from instruction address
    modrm.size           // 1 if modrm exists or 0
    sib.offset           // offset where the sib starts from instruction address
    sib.size             // 1 if sib exists or 0
    displacement.offset  // offset where the displacement starts from instruction address
    displacement.size    // size of 1/2/4/8-byte displacement
    immediate.offset     // offset where the immediate starts from instruction address
    immediate.size       // size of 1/2/4/8-byte immediate

With AVX, I believe you can have a supplementary byte which ends the instruction.
As for AVX3.x (aka AVX-512), I don't know whether there are supplementary bytes after the instruction or just before it, through prefixes.
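One way to picture the proposed structure is the C sketch below. It is not the layout that ended up in Capstone or in any pull request; the type names, field types, and the `layout_from_sizes` helper are assumptions made for illustration. It places the fields back to back in the legacy x86 encoding order (prefixes, opcode, ModRM, SIB, displacement, immediate) and does not model any AVX-specific trailing bytes.

```c
#include <assert.h>

/* Hypothetical types for illustration only. An offset is meaningful only
 * when the matching size is non-zero. */
struct field_span {
    unsigned offset; /* bytes from the instruction address */
    unsigned size;   /* 0 if the field is absent */
};

struct offsets_and_sizes {
    struct field_span prefixes;
    struct field_span opcode;
    struct field_span modrm;
    struct field_span sib;
    struct field_span displacement;
    struct field_span immediate;
};

/* Record where a field starts and advance the running position. */
static void place(struct field_span *f, unsigned *pos, unsigned size)
{
    f->offset = *pos;
    f->size = size;
    *pos += size;
}

/* Derive every offset from the per-field sizes a decoder already knows. */
struct offsets_and_sizes layout_from_sizes(unsigned prefixes, unsigned opcode,
                                           unsigned modrm, unsigned sib,
                                           unsigned disp, unsigned imm)
{
    struct offsets_and_sizes l;
    unsigned pos = 0;

    /* An x86 instruction is at most 15 bytes long. */
    assert(prefixes + opcode + modrm + sib + disp + imm <= 15);

    place(&l.prefixes, &pos, prefixes);
    place(&l.opcode, &pos, opcode);
    place(&l.modrm, &pos, modrm);
    place(&l.sib, &pos, sib);
    place(&l.displacement, &pos, disp);
    place(&l.immediate, &pos, imm);
    return l;
}
```

For `lea rdx, [rip + disp32]` (bytes `48 8D 15` followed by four displacement bytes), calling `layout_from_sizes(1, 1, 1, 0, 4, 0)` puts the displacement at offset 3, counting the REX byte as a prefix.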

Contributor Author stevemk14ebr commented Aug 7, 2015

Something like that would be perfect

Contributor Author stevemk14ebr commented Aug 7, 2015

I've begun implementing the API proposed by hlide in pull request #444. I don't have enough knowledge of all the cases and platforms to implement this for anything other than x86; I would appreciate it if others could help pick this one up.

mtivadar commented Sep 10, 2015

What about feature (1)? You suggested a group like CS_GRP_BRANCH_REL; would it be possible to do it as stevemk14ebr suggested and have some sort of flag (or group) like X86_INS_REL? This would be necessary for instructions like "lea rdx, qword ptr [rip + disp]", not only "jmp [rip + disp]".
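To illustrate why a branch-only group would miss cases like the `lea` above: in 64-bit mode, RIP-relative memory addressing is selected by the ModRM byte (mod = 00, r/m = 101), regardless of the opcode. The sketch below checks just that bit pattern. `modrm_is_rip_relative` is a hypothetical helper, not a Capstone function, and it assumes the caller already knows the instruction has a ModRM byte and is decoded in 64-bit mode (in 32-bit mode the same encoding means an absolute disp32).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when a ModRM byte, decoded in 64-bit mode,
 * selects [RIP + disp32] addressing. */
bool modrm_is_rip_relative(uint8_t modrm)
{
    uint8_t mod = (uint8_t)(modrm >> 6); /* top two bits */
    uint8_t rm = (uint8_t)(modrm & 7);   /* bottom three bits */

    assert(mod <= 3 && rm <= 7);
    return mod == 0 && rm == 5;
}
```

Both `jmp [rip + disp]` (ModRM 0x25) and `lea rdx, [rip + disp]` (ModRM 0x15) match, while a register-direct form such as ModRM 0xC0 does not. Relative branches like `E8`/`E9` carry their offset as an immediate rather than through ModRM, so a complete flag would need to cover both kinds.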