Tiberiu Boros
Software Developer / Computer Scientist Adobe
BIOGRAPHY
Tiberiu Boros holds a Ph.D. in computer science, specializing in Text-to-Speech (TTS) synthesis. He currently works for Adobe Systems Romania and is an associate of the Research Institute for Artificial Intelligence of the Romanian Academy. He also maintains two open-source machine learning projects (TTS-Cube and NLP-Cube) and contributes to the DyNet machine learning framework (developed by Carnegie Mellon University and many others). His research focuses on applied natural language and speech processing.
Project SCOUT. Deep Learning for malicious code detection
The number of client-side attack vectors has increased dramatically in the last decade. From exploiting browser vulnerabilities to running miners or triggering drive-by downloads, attackers commonly use JavaScript code to achieve their goals. In the past, malicious code classification has relied on standard feature engineering over static code analysis or dynamic code execution patterns.
We propose a new deep-learning-inspired methodology for detecting malicious code, based on latent representations computed in an unsupervised manner. We explore three different methods for computing the latent representations in a deep encoder-decoder architecture: self-attention, global style tokens (GST) and “memory-based” representations.
The three strategies for computing latent representations capture different aspects of how the code is written: (a) the GST tokens capture specific attacker techniques, such as code that is obfuscated or encrypted or that performs many string manipulations; (b) the memory-based method learns “code patterns” such as iterators, if/else statements and asserts; and (c) the multi-head attention method captures on-the-fly summarizations of code segments that are hard to reconstruct (those that do not follow standard patterns).
1. The self-attention model represents code as the concatenated values of all heads in a multi-head attention system;
2. The GST method computes a probability distribution (attention) over a fixed number of style tokens (embeddings) and the latent representation is obtained as the weighted sum over all the tokens;
3. Finally, the memory-based method is similar to GST, but it computes multiple probability distributions over different buckets of style-tokens.
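The three computations listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the dot-product scoring, the concatenation of per-bucket results, the function names (`concat_heads`, `gst_latent`, `memory_latent`) and the toy vectors are all hypothetical. In the setup described here, the query would come from the encoder and the style tokens would be learned embeddings.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def concat_heads(head_outputs):
    """Self-attention variant: concatenate the value vectors of all heads."""
    return [x for head in head_outputs for x in head]


def gst_latent(query, tokens):
    """GST variant: attention over a fixed set of style tokens,
    followed by a weighted sum of those tokens.
    Assumption: scores are plain dot products with the query."""
    weights = softmax([dot(query, t) for t in tokens])
    dim = len(tokens[0])
    latent = [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]
    return latent, weights


def memory_latent(query, buckets):
    """Memory-based variant: one attention distribution per bucket.
    Assumption: the per-bucket weighted sums are concatenated."""
    latent = []
    for bucket in buckets:
        part, _ = gst_latent(query, bucket)
        latent.extend(part)
    return latent


# Toy data (hypothetical): a 2-dimensional query and three style tokens.
query = [1.0, 0.0]
tokens = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

latent, weights = gst_latent(query, tokens)
memory = memory_latent(query, [tokens[:2], tokens[1:]])
heads = concat_heads([[0.1, 0.2], [0.3, 0.4]])
```

With these toy values the attention weights sum to one and favor the token most aligned with the query, and the memory-based vector is twice as long as the GST vector because two buckets are concatenated.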
The latent code representations are used as input for a multilayer perceptron that classifies a code segment as malicious or benign. Our initial experiments on previously unseen data show state-of-the-art results in classifying both isolated code sequences and entire JS files as malicious or benign.
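The classification step can be pictured as a small forward pass over a latent vector. The sketch below is only an assumption about the shape of such a classifier: the single hidden layer, the tanh and sigmoid activations, the 0.5 threshold, the function name `malicious_probability` and the random, untrained weights are all hypothetical and not taken from the talk.

```python
import math
import random


def malicious_probability(latent, w1, b1, w2, b2):
    """One-hidden-layer perceptron: tanh hidden units, sigmoid output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, latent)) + b)
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical, untrained weights; a real model would learn these from labeled code.
rng = random.Random(0)
latent_dim, hidden_dim = 4, 3
w1 = [[rng.uniform(-1, 1) for _ in range(latent_dim)] for _ in range(hidden_dim)]
b1 = [0.0] * hidden_dim
w2 = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
b2 = 0.0

latent = [0.5, -0.2, 0.1, 0.9]  # stand-in for an encoder's latent representation
probability = malicious_probability(latent, w1, b1, w2, b2)
is_malicious = probability >= 0.5  # assumed decision threshold
```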
The same latent-representation extraction methodology can be applied to multiple datasets, regardless of the programming language, to address a wide variety of code-related tasks, such as identifying vulnerable code, identifying bad practices, indexing code (finding similar code) and detecting copyright issues.
This talk is co-presented with Marius Manica, Cyber Incident Response at Adobe