I'm quite sure we are thinking of different things. STIR is a binary randomization technique used to mitigate rop, and was developed in sitsu with the binary rewriting techniques. The technique of retaining the original code section is a failback to guard against errors in rewriting, but to my knowledge doesn't impose a performance penalty. Size required is a constant multiple, so I don't consider it an adoption hurdle. But everybody has different use scenarios. Right. Correctness is critical. I think co program proof methodologies with tools like coq will shine here in proofs to remove required trust that a rewrtten binary is conformant to certain execution properties. I hadn't know static rewriters even existed. I presume you are you talking about dynamic tools. On Oct 25, 2015 4:49 PM, "Shuai Wang" wrote: > Hello Kenneth, > > Yes, I agree Binary Stirring system can eliminate symbolization false > positive as well. Actually I believe > many research work, and tools (DynInst, for instance) have implemented > this a so-called "replica-based" > binary instrumentation framework. This is a quite robust way to > instrument binary code, although size expansion and > performance penalty cannot be ignored in the instrumentation outputs. > > However, I found those solutions are all quite complex, and difficult to > understand. And it might not be inaccurate > to assume "aggressive" instrumentation methods can break the functionality > due to the limitation of design, > or challenges in bug-less implementation. I even found that some > the-state-of-the-art binary instrumentation tools > cannot preserve the correct functionality when employing them to > instrument some SPEC2006 test cases. > > I personally would like to find some cleaner solutions, which can > introduce very little overhead in terms of binary > size and execution. Besides, some research work reveals that binary > security applications built on top of previous > instrumentation framework do leave certain exploitable vulnerabilities due > to the design limitations. > > Sincerely, > Shuai > > > > > > > On Sun, Oct 25, 2015 at 3:25 PM, Kenneth Adam Miller < > kennethadammiller@gmail.com> wrote: > >> Replied inline >> >> On Sun, Oct 25, 2015 at 3:04 PM, Shuai Wang >> wrote: >> >>> Hello Kenneth, >>> >>> Sorry for the late reply. I have several deadlines during this weekend. >>> >>> To answer your question, our current approach cannot ensure 100% >>> "reposition" correct. >>> The most challenging part is to identify code pointers in global data >>> sections, as we discussed >>> in our paper, it is quite difficult to handle even with some static >>> analysis techniques >>> (type inference, for instance). We do have some false positive, as shown >>> in the appendix of our paper [1]. >>> We will research more to eliminate the false positive. >>> >>> I believe it is doable to present a sound solution. It indeed requires >>> some additional >>> trampolines inserted in the binary code. You may refer to this paper for >>> some enlightens [2]. >>> >>> As for the disassembling challenges, we directly adopt a disassembly >>> approach proposed >>> by an excellent work [3]. You can check out their evaluation section, >>> and find that their approach >>> can correctly disassemble large-size applications without any error. My >>> experience is that Linux ELF >>> binaries are indeed easier to disassemble, and typical compilers (gcc; >>> icc; llvm) would not >>> insert data into code sections (the embedded data can trouble linear >>> disassembler a lot). >>> >>> >> I remember reading about [3] when it came out. That was a year after the >> original REINS system came out that proposed re-writing binaries, along >> with it's companion STIR. Shingled disassembly originated with Wartell et >> al.'s seminal Distinguishing Code and Data PhD thesis. I'm currently >> working on the integration of a sheering and PFSM enhanced Shingled >> Disassembler into BAP. But if you've already implemented something like >> that, what would be really valuable is if you were to review my shingled >> disassembler implementation and I review yours that way we have some cross >> review feedback. >> >> Regarding the need for 100% accuracy, in the REINS and STIR papers, the >> approach taken is to obtain very very high classification accuracy, but in >> the case that correctness cannot be established, to simply retain each >> interpretation of a byte sequence, so you are still correct in the instance >> that it's code by treating it as such. Then, a companion technique is >> introduced wherein the code section is retained in order that should such a >> data reference in code instance occur and interpretation was incorrect, >> such reference can read and write into the kept section. But if it's code, >> it has been rewritten in the new section. Then it should remain correct in >> any scenario. >> >> >>> However, if I am asked to work on PE binaries, then I will probably >>> start from IDA-Pro. >>> We consider the disassembling challenge is orthogonal to our research. >>> >> >> It is good to have good interoperabiblity with IDA as a guided >> disassembler and the actual new research tools. One of the most valuable >> things I can think of is to write some plugin that will mechanize data >> extraction as needed in order to accelerate manual intervention with the >> newer tools, such as in the case of training. >> >> >>> >>> IMHO, our research reveals the (important) fact that even though >>> theoretically relocation issue >>> is hard to solve with 100% accuracy, it might not be as troublesome as >>> it was assumed by previous work. >>> Simple solutions can achieve good results. >>> >> >> Agreed; there are failback stratagems. >> >> >>> >>> I hope it answers your questions, otherwise, please let me know :) >>> >>> Best, >>> Shuai >>> >>> [1] Shuai Wang, Pei Wang, Dinghao Wu, Reassembleable Disassembling. >>> [2] Zhui Deng, Xiangyu Zhang, Dongyan Xu, BISTRO: Binary Component >>> Extraction and Embedding for Software Security Applications >>> [3] Mingwei Zhang, Sekar, R, Control Flow Integrity for COTS Binaries. >>> >>> >>> >> There's a good utility for working with white papers and interacting with >> colleagues; mendeley. >> >> >>> >>> >>> >>> >>> >>> On Fri, Oct 23, 2015 at 6:31 PM, Kenneth Adam Miller < >>> kennethadammiller@gmail.com> wrote: >>> >>>> Well it's interesting that you've gone with a binary recompilation >>>> approach. How do you ensure that, statically, for any given edit, you >>>> reposition all the jump targets correctly? How do you deal with the >>>> difficulty of disassembly reducing to the halting problem? >>>> >>>> On Fri, Oct 23, 2015 at 4:59 PM, Shuai Wang >>>> wrote: >>>> >>>>> Hi guys, >>>>> >>>>> I am glad that you are interested in our work!! >>>>> >>>>> Actually this project starts over 1.5 years ago, and I believe at that >>>>> time, BAP (version 0.7 I believe?) is still a research prototype.. >>>>> >>>>> I choose to implement from the stretch is because I want to have a >>>>> nice tool for my own research projects, also I can have an opportunity >>>>> to learn OCaml... :) >>>>> >>>>> Yes, I definitely would like to unite our efforts!! >>>>> >>>>> Best, >>>>> Shuai >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Oct 23, 2015 at 1:30 PM, Ivan Gotovchits wrote: >>>>> >>>>>> Hi Shuai, >>>>>> >>>>>> Nice work! But I'm curious, why didn't you use [bap][1] as a >>>>>> disassembler? >>>>>> >>>>>> Do you know, that we have a low-level interface to disassembling, >>>>>> like [linear_sweep][2] or even >>>>>> lower [Disasm_expert.Basic][3] interface, that can disassemble on >>>>>> instruction level granularity. >>>>>> >>>>>> It will be very interesting, if we can unite our efforts. >>>>>> >>>>>> Best wishes, >>>>>> Ivan Gotovchits >>>>>> >>>>>> [1]: https://github.com/BinaryAnalysisPlatform/bap >>>>>> [2]: >>>>>> http://binaryanalysisplatform.github.io/bap/api/master/Bap.Std.html#VALlinear_sweep >>>>>> [3]: >>>>>> http://binaryanalysisplatform.github.io/bap/api/master/Bap.Std.Disasm_expert.Basic.html >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Oct 23, 2015 at 1:05 PM, Shuai Wang >>>>>> wrote: >>>>>> >>>>>>> Dear List, >>>>>>> >>>>>>> I’m glad to announce the first release of Uroboros: an infrastructure >>>>>>> for reassembleable disassembling and transformation. >>>>>>> >>>>>>> You can find the code here: https://github.com/s3team/uroboros >>>>>>> You can find our research paper which describes the core technique >>>>>>> implemented in Uroboros here: >>>>>>> >>>>>>> https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-wang-shuai.pdf >>>>>>> >>>>>>> We will provide a project home page, as well as more detailed >>>>>>> documents in the near future. Issues and pull requests welcomed. >>>>>>> >>>>>>> Happy hacking! >>>>>>> >>>>>>> Sincerely, >>>>>>> Shuai >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> >