Isolating JavaScript in Dynamic Code Environments

Execution Environments for Cloud Applications – Spring 2011

Background

Modern web applications combine client-side and server-side technologies to generate dynamic content (for example, PHP and JavaScript).

Different web frameworks use different methods to mix code.

The different levels of intermixing between programming languages need to be identified.

This identification benefits XSS mitigation schemes and operations such as code analysis, optimization, and refactoring.

Key Points

Analyze the source code of web applications (phpBB, WordPress, phpMyAdmin and Drupal)

Identify the coding idioms for dynamic content generation and intermixing of languages (PHP, JavaScript)

Classify them into different classes.

Provide methods to reduce mixing in each of the classes.

Analysis Methodology

Each web application's code is processed by a customized tool that works in two parts.

Part 1: Removal of PHP and HTML comments and of HTML event attributes such as onclick and onload.

Part 2: Randomization of the remaining code by a parser.

If the parser fails to randomize the code, intermixing is confirmed.

All scripts are processed by the tool and the failures are recorded.
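The two-part check above can be sketched roughly as follows. This is a minimal illustration, not the tool used in the study: a plain syntax check (compiling with new Function, which does not run the code) stands in for the randomizing parser, PHP comment removal is left out, and the function names and the sample page are hypothetical.

```javascript
// Hypothetical helper: true when `source` is syntactically valid JavaScript.
// new Function() compiles the text without executing it; a SyntaxError means
// the parser could not consume the script.
function parsesAsJavaScript(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Part 1 (simplified): drop HTML comments and inline event attributes.
function stripMarkupNoise(page) {
  return page
    .replace(/<!--[\s\S]*?-->/g, "")
    .replace(/\son\w+\s*=\s*(?:"[^"]*"|'[^']*')/gi, "");
}

// Collect the bodies of inline <script> elements.
function extractInlineScripts(page) {
  const scripts = [];
  const pattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  let match;
  while ((match = pattern.exec(page)) !== null) {
    scripts.push(match[1]);
  }
  return scripts;
}

// Part 2 (stand-in): report every script the parser cannot consume.
function findMixedScripts(page) {
  return extractInlineScripts(stripMarkupNoise(page)).filter(
    (script) => !parsesAsJavaScript(script)
  );
}

// Made-up sample page: one clean script, one with an embedded PHP block.
const page = [
  '<body onload="init()">',
  "<!-- generated page -->",
  "<script>var greeting = 'hello';</script>",
  "<script>var count = <?php echo $count; ?>;</script>",
  "</body>",
].join("\n");

console.log(findMixedScripts(page));
```

In this sketch only the second script is reported, because the embedded PHP block is not valid JavaScript on its own.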

Analysis Results

[Table]

The final column shows the number of scripts involving code mixing.

A total of 163 scripts out of 1000 (16.3%) were found to contain intermixed code.

Classifying coding idioms

The 163 scripts were investigated manually to identify five cases of coding idioms.

Case 1: Partial injection of non-mixed JavaScript source using PHP's built-in echo.

Case 2: String concatenations

Single and double quotes are part of complex string concatenation operations.

The parser fails to randomize such code.

Case 3: Partial JavaScript code generation by PHP scripting blocks

This is the most frequent case of code intermixing.

The parser fails to consume the PHP code.

Case 4: JavaScript code generation using the frameworks' meta-languages

This case occurs only in phpBB.

[Example]

Case 5: Markup injections

Symbols like ‘&’ are processed as ‘&amp;’.

[Example]

Classification Results

[Table]

Most of the scripts fall into the third case.

The meta-language case, Case 4, occurs only in phpBB.

Cases 1 and 5 are limited.

The dominant idioms are string concatenations and partial injection using PHP scripting blocks.

Mixing reduction

Mixing is reduced by altering the mixing code or by extending the parser to support individual cases.

Case 1: Alternate coding is preferred. The programmer can inject the JavaScript code in the PHP block.

Case 2: Alternate coding is preferred. Mixing is reduced by using fewer quotes and concatenation parts.

[Example]

Case 3: Alternate coding and a parser extension are both used. The parser identifies the PHP block and consumes it first. If parsing still fails after this step, alternate coding is applied.

[Example]
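The Case 3 strategy (consume the PHP block first, fall back to alternate coding if parsing still fails) might look like the sketch below. It is a hypothetical illustration, not the example from the original slide: the placeholder identifier __PHP__, the function names, and both sample snippets are assumptions, and a plain syntax check again stands in for the randomizing parser.

```javascript
// Hypothetical helper: true when `source` is syntactically valid JavaScript.
function parsesAsJavaScript(source) {
  try {
    new Function(source); // compiles only, never executes
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Matches <?php ... ?>, <?= ... ?> and short <? ... ?> blocks.
const PHP_BLOCK = /<\?(?:php\b|=)?[\s\S]*?\?>/g;

// "Consume" each PHP block by swapping it for a neutral identifier
// (an assumption of this sketch) so the JavaScript around it can be parsed.
function consumePhpBlocks(script) {
  return script.replace(PHP_BLOCK, "__PHP__");
}

// Decide which reduction step a script needs.
function classify(script) {
  if (parsesAsJavaScript(script)) return "not mixed";
  if (parsesAsJavaScript(consumePhpBlocks(script))) {
    return "handled by parser extension";
  }
  return "needs alternate coding";
}

// PHP injects a value: consuming the block leaves valid JavaScript.
const valueInjection = "var count = <?php echo $count; ?>;";

// PHP wraps JavaScript in control flow: consuming the blocks is not enough.
const controlFlow = "<?php if ($admin) { ?> showAdminPanel(); <?php } ?>";

console.log(classify(valueInjection));
console.log(classify(controlFlow));
```

The second snippet shows why the fallback exists: once PHP supplies statement structure rather than a value, a placeholder substitution no longer yields parseable JavaScript and the code has to be rewritten.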

Case 4: The parser is extended if the substitution is simple. Alternate coding is used otherwise.

Case 5: The parser is extended to recognize HTML entities (like &amp;) and ignore them during syntax analysis.
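A rough sketch of the Case 5 idea follows. The slides say only that the parser recognizes entities and ignores them; this illustration swaps in a simpler technique, decoding a small, assumed set of entities before the syntax check. The entity table, function names, and sample script are hypothetical.

```javascript
// Hypothetical helper: true when `source` is syntactically valid JavaScript.
function parsesAsJavaScript(source) {
  try {
    new Function(source); // compiles only, never executes
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Small entity table chosen for this sketch; real markup uses many more.
const ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"' };

// Decode in a single pass so "&amp;lt;" becomes "&lt;" and not "<".
function decodeEntities(script) {
  return script.replace(/&(amp|lt|gt|quot);/g, (_, name) => ENTITIES[name]);
}

// Made-up script whose operators were escaped as markup.
const escaped = "if (a &amp;&amp; b &lt; 3) { run(); }";

console.log(parsesAsJavaScript(escaped));
console.log(parsesAsJavaScript(decodeEntities(escaped)));
```

The escaped form fails the syntax check because the parser reads "&amp;" as a bitwise AND followed by a stray semicolon; after decoding, the same script parses.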

Results after reduction

Parser extensions and code rewriting strongly reduce intermixing.

The results show that the reduction process minimizes failure rates for Case 3 and Case 4.

Conclusion

Over half a million lines of code (LoC) were processed.

1000 scripts were identified, of which 163 had PHP intermixed with JavaScript.

The 163 scripts were manually investigated to create a classification scheme of five distinct classes.

Techniques to reduce mixing were proposed.

Questions??

Comments!!