forked from k0kubun/hescape

C library for fast HTML escape using SSE instructions


GerHobbelt/hescape

 
 

Hescape

A C library for fast HTML escaping using the SSE4.2 instruction pcmpestri. From Ruby, you can use it via the hescape gem.
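
To illustrate how pcmpestri can help here, the following is a minimal sketch that scans 16 bytes at a time for the first character that needs escaping. It is not hescape's actual code: the helper name find_escapable, the macro SKETCH_HAVE_SSE42, and the scalar tail loop are assumptions made for this example, and the SSE4.2 path is only compiled with GCC or Clang on x86.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <nmmintrin.h> /* SSE4.2 intrinsics, including _mm_cmpestri */
#define SKETCH_HAVE_SSE42 1
#endif

/* Hypothetical helper (not part of hescape's API): returns the index of the
 * first byte in src[0..size) that is one of " & ' < >, or size if none. */
#ifdef SKETCH_HAVE_SSE42
__attribute__((target("sse4.2")))
#endif
static size_t find_escapable(const uint8_t *src, size_t size) {
  size_t i = 0;
#ifdef SKETCH_HAVE_SSE42
  /* The five characters to search for; only the first 5 lanes are used. */
  const __m128i set = _mm_setr_epi8('"', '&', '\'', '<', '>',
                                    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
  while (size - i >= 16) {
    __m128i chunk = _mm_loadu_si128((const __m128i *)(src + i));
    /* pcmpestri in "equal any" mode: index of the first byte of chunk that
     * matches any byte of set, or 16 when there is no match. */
    int idx = _mm_cmpestri(set, 5, chunk, 16,
                           _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY |
                               _SIDD_LEAST_SIGNIFICANT);
    if (idx < 16) {
      return i + (size_t)idx;
    }
    i += 16;
  }
#endif
  /* Scalar loop for the remaining bytes (and for non-x86 builds). */
  for (; i < size; i++) {
    uint8_t c = src[i];
    if (c == '"' || c == '&' || c == '\'' || c == '<' || c == '>') {
      return i;
    }
  }
  return size;
}
```

Handling the last, partial chunk with a scalar loop keeps the sketch from reading past the end of src; a real implementation may treat the tail differently.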

API

Hescape provides a single function, hesc_escape_html. The API may change at any time because the project is still experimental.

size_t hesc_escape_html(uint8_t **dest, const uint8_t *src, size_t size);

Given src and size, it stores a pointer to the escaped result in *dest and returns the result's size in bytes. Note that hesc_escape_html expects src to be a UTF-8 string, and that it allocates new memory only when src contains characters that need escaping. You need to free *dest only when the return value is larger than size.

It escapes src according to the following rules.

  " --> &quot;
  & --> &amp;
  ' --> &#39;
  < --> &lt;
  > --> &gt;

It is designed to produce the same output as CGI.escapeHTML in Ruby.
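
For reference, the rules and the allocation contract described above can be sketched as a plain scalar function. This is not hescape's implementation: the names ref_entity and ref_escape_html are hypothetical, and two behaviours are assumptions of this sketch, namely that *dest is pointed at src when nothing needs escaping, and that 0 is returned with *dest set to NULL if allocation fails.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: the entity for one byte, or NULL if the byte is
 * copied unchanged. */
static const char *ref_entity(uint8_t c) {
  switch (c) {
    case '"':  return "&quot;";
    case '&':  return "&amp;";
    case '\'': return "&#39;";
    case '<':  return "&lt;";
    case '>':  return "&gt;";
    default:   return NULL;
  }
}

/* Hypothetical scalar reference with the same shape as hesc_escape_html. */
static size_t ref_escape_html(uint8_t **dest, const uint8_t *src, size_t size) {
  /* First pass: compute the escaped length. */
  size_t out = 0;
  for (size_t i = 0; i < size; i++) {
    const char *e = ref_entity(src[i]);
    out += e ? strlen(e) : 1;
  }
  if (out == size) {
    /* Nothing to escape, so no allocation. Assumption: reuse src as-is. */
    *dest = (uint8_t *)src;
    return size;
  }
  uint8_t *buf = malloc(out + 1);
  if (buf == NULL) {
    /* Assumption: report allocation failure as an empty result. */
    *dest = NULL;
    return 0;
  }
  /* Second pass: copy bytes, substituting entities where needed. */
  size_t j = 0;
  for (size_t i = 0; i < size; i++) {
    const char *e = ref_entity(src[i]);
    if (e) {
      size_t n = strlen(e);
      memcpy(buf + j, e, n);
      j += n;
    } else {
      buf[j++] = src[i];
    }
  }
  buf[j] = '\0'; /* NUL-terminate for convenience in this sketch */
  *dest = buf;
  return out;
}
```

Because every entity is longer than the character it replaces, the returned size exceeds the input size exactly when a new buffer was allocated, which is why the caller can decide whether to free by comparing the two.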

Benchmark

Results of the benchmark against houdini:

==============================================
 no escape (N=300000)
==============================================
hescape: 0.065899 s (4552446.6 i/s)
houdini: 0.169504 s (1769864.6 i/s)

hescape is 2.57x faster

==============================================
 10% escape (N=80000)
==============================================
hescape: 0.184101 s (434543.7 i/s)
houdini: 0.309255 s (258686.3 i/s)

hescape is 1.68x faster

==============================================
 all escape (N=20000)
==============================================
hescape: 0.297408 s (67247.8 i/s)
houdini: 0.296916 s (67359.0 i/s)

hescape is 1.00x faster

==============================================
 wikipedia table (N=10000)
==============================================
hescape: 0.313769 s (31870.6 i/s)
houdini: 0.468118 s (21362.1 i/s)

hescape is 1.49x faster

Usage

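#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "hescape.h"

int main(void) {
  const char *src = "<>\"'&";
  size_t size = strlen(src);
  uint8_t *dest;
  size_t len = hesc_escape_html(&dest, (const uint8_t *)src, size);
  /* Print exactly len bytes instead of relying on NUL termination. */
  printf("%s => %.*s\n", src, (int)len, (const char *)dest);
  /* dest is newly allocated only when something was escaped. */
  if (len > size) {
    free(dest);
  }
  return 0;
}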

// <>"'& => &lt;&gt;&quot;&#39;&amp;

Note

Many of the ideas, apart from the use of pcmpestri, originally come from vmg/houdini.

License

MIT License
