TODO
*** Severe
* NullPointerException when clicking on Reuters' 'next page'
*** Important
* React sensibly when encountering HTML entities
* Improve tagger:
  - daemonize
  - test multi-language stuff
*** Bugs
* Don't break on HTML tags written with &lt; and &gt; entities instead of < and >.
* WERTi.java does not respect this.server; line 179
* Does the HTML annotation really need the tag-name field?
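
A minimal sketch for the entity bug above, assuming the problem is that
entities get decoded before tags are detected, so that escaped text such as
&lt;b&gt; is mistaken for markup. The class EntityAwareSplitter and its
Segment type are hypothetical and not part of WERTi. The idea is to split on
literal < and > first and to decode entities only inside text segments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of WERTi: separates literal tags from text
// and decodes a handful of entities in the text segments only.
class EntityAwareSplitter {

    /** One piece of the input: either a literal tag or a run of decoded text. */
    static final class Segment {
        final boolean isTag;
        final String text;

        Segment(boolean isTag, String text) {
            this.isTag = isTag;
            this.text = text;
        }
    }

    // Matches a literal tag such as <p> or </p>; escaped &lt;...&gt; never matches.
    private static final Pattern TAG = Pattern.compile("<[^<>]+>");

    static List<Segment> split(String html) {
        List<Segment> out = new ArrayList<>();
        Matcher m = TAG.matcher(html);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) {
                out.add(new Segment(false, decode(html.substring(last, m.start()))));
            }
            out.add(new Segment(true, m.group()));
            last = m.end();
        }
        if (last < html.length()) {
            out.add(new Segment(false, decode(html.substring(last))));
        }
        return out;
    }

    // Decodes only four common entities; &amp; goes last so that
    // "&amp;lt;" becomes "&lt;" rather than "<".
    static String decode(String text) {
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
    }
}
```

With this ordering, the input <p>a &lt;b&gt; c</p> yields the tag <p>, the
text "a <b> c", and the tag </p>; the escaped <b> stays text. A real fix
would more likely rely on the HTML parser from [0] than on a regular
expression.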
*** Wishlist
************
- Put the LingPipe sentence boundary detector to use
  (we'd have to re-implement its API for this)
- Implement HTML parser [0]
- Implement TreeTagger
- Use HTML entities in sentence boundary followers (e.g. for &quot; or
  similar)
- Internationalize the exceptions, logging, and the whole application,
  following [1]
- Interactive Passivator
- Make up a RadioGroup for GWT
References:
===========
[0] http://htmlparser.sourceforge.net/
[1] http://java.sun.com/j2se/1.5.0/docs/guide/intl/index.html