Does the censor list use regular expressions?
Asterelle - Sanctuary_1381265973
Posts: 7,881 Arc User
If these are regular expressions, can you please add the word-boundary anchor '\b' to the start and end of every English entry in the list?
http://www.regular-expressions.info/wordboundaries.html
Seeing a word like "Circumstance" turn into "Cir****tance" is very sloppy.
At the moment it doesn't matter that the list is too big; it's not even behaving properly.
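To illustrate the difference being asked for, here is a minimal sketch in Python. The entry "cat" is a harmless, hypothetical stand-in for a censor-list entry, and both function names are made up for this example; nothing here reflects how the game's filter is actually implemented.

```python
import re

# Hypothetical stand-in for one censor-list entry (not taken from the real list).
BLOCKED = "cat"

def censor_substring(text, word=BLOCKED):
    """Mask the entry wherever it appears, even inside a longer word."""
    pattern = re.escape(word)
    return re.sub(pattern, lambda m: "*" * len(m.group()), text, flags=re.IGNORECASE)

def censor_whole_word(text, word=BLOCKED):
    """Mask the entry only when it stands alone, using \\b on both sides."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.sub(pattern, lambda m: "*" * len(m.group()), text, flags=re.IGNORECASE)

print(censor_substring("concatenate the Cat"))   # con***enate the ***
print(censor_whole_word("concatenate the Cat"))  # concatenate the ***
```

The substring version mangles an innocent longer word in the same way "Circumstance" is mangled above, while the `\b` version only masks the standalone entry.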
Refining Simulator - aster.ohmydays.net/pw/refiningsimulator.html (don't use IE)
Genie Calculator - aster.ohmydays.net/pw/geniecalculator.html - (don't use IE)
Socket Calculator - aster.ohmydays.net/pw/socketcalculator.html
Comments
-
I doubt it:
- There is no evidence that any entry on the list does anything other than a basic substring match (feel free to correct me if you know otherwise)
- Using a regex over a substring search would be significant extra effort on the part of the developers (but that's not unheard of - did you know that the announcement screen when you log in can render a subset of HTML, but the GMs only ever give it plain text?)
- Regex matches could, given the ridiculously large set of filters that are now present, quite easily become a noticeable performance issue given rapid chat and presumably several hundred patterns. Especially if they're blocking each Chinese character individually (of course, a regex would make that much easier)
Regexes would've been the way to go, I imagine. Having word breaks on either side does make it much easier to bypass the filter, but arguably e.g. "assassin" really should bypass a filter (and does on the forum, clearly).
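On the aside that a regex would make per-character blocking much easier: a minimal sketch, assuming the goal is to mask Chinese characters and assuming the Unicode range U+4E00 to U+9FFF (the CJK Unified Ideographs block) is an acceptable stand-in for whatever the real list covers. The name `mask_cjk` is hypothetical.

```python
import re

# Assumption: one character class over the CJK Unified Ideographs block
# (U+4E00..U+9FFF) replaces thousands of single-character list entries.
CJK = re.compile(r"[\u4e00-\u9fff]")

def mask_cjk(text):
    """Replace each character in the assumed range with an asterisk."""
    return CJK.sub("*", text)

print(mask_cjk("hello 你好"))  # hello **
```

A single range like this is one pattern to test per message instead of one list entry per character.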
They are blocking each letter/character individually. I've seen the list: all Japanese, Russian, Korean, Chinese, and other characters are blocked one character at a time.
I'm a guy, not a woman, that is all
"When you're on Team Bring it, every morning your feet hit the floor, the good lord says "good morning" and the devil says 'Oh **** they're up' " - Dwayne "The Rock" Johnson
Are you on Team Bring it?
Kathikins - Dreamweaver wrote: »
- Using a regex over a substring search would be significant extra effort on the part of the developers (but that's not unheard of - did you know that the announcement screen when you log in can render a subset of HTML, but the GMs only ever give it plain text?)
not really. the actual search is a call-out to a library function, and regex compilers/matchers are things you pull in from pre-written industry-standard libraries. i've done the equivalent, and it's not any serious extra amount of code to use regexes over case-insensitive substring matching.
Kathikins - Dreamweaver wrote: »
- Regex matches could, given the ridiculously large set of filters that are now present, quite easily become a noticeable performance issue given rapid chat and presumably several hundred patterns. Especially if they're blocking each Chinese character individually (of course, a regex would make that much easier)
you'd be surprised, i think. the text to be filtered is an amount humans can more or less follow along with and read as it pops up --- or there'd be no point, after all. for computers that can redraw several thousand textured triangles on screen several times per second, running a (usually highly optimized) regex engine over so little text that us ugly sacks of mostly water can follow along in real time is an insignificant effort. decoding the MP3 files for the background music probably takes more CPU.
no, scratch that; the MP3 codec almost certainly takes VERY MUCH more CPU.
you'd load the badwords.txt file at client startup, pre-compile the regexes to state machines, and run those from cache. i dunno how large badwords.txt has become, now --- i should go look, probably --- but we'd likely be talking about a few megabytes of data all told. about what five minutes of MP3 takes up encoded, that would be.
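The load-once, compile-once idea above could be sketched roughly as follows. This is only an illustration under assumptions: the function names and the sample entries are hypothetical, the real format of badwords.txt is not shown in this thread, and Python's `re` module is a backtracking matcher rather than a true state machine, though the compile-at-startup pattern carries over.

```python
import re

def load_filter(lines):
    """Build one compiled pattern from an iterable of list entries, once, at startup."""
    words = [w.strip() for w in lines if w.strip()]
    # Longest entries first, so the alternation prefers the longer of two overlapping entries.
    words.sort(key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in words)
    # \b on both sides keeps entries from matching inside longer words.
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)

def censor(pattern, message):
    """Mask every match in one chat message using the cached pattern."""
    return pattern.sub(lambda m: "*" * len(m.group()), message)

# Hypothetical entries; a real client might pass the lines of an opened word-list file.
FILTER = load_filter(["cat", "dog", ""])

print(censor(FILTER, "My Dog chased a cat into the catalog"))
# My *** chased a *** into the catalog
```

The compile step runs once; each incoming chat line then costs a single `sub` call against the cached pattern.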
of course, this is all me trying to be mr. sensible programmer. i'm not sure if there are any members of that species working for wanmei anymore...
Heaven's Tear alts: KenLubin, Sou_Hon, JudyCaraco --- level 5x chars.
they are using OGM, not MP3, the MP3s are external and not used by the game.
if you are going to talk about something, please know what you're talking about.
all sounds are in sfx.pck
I'm a guy, not a woman, that is all
"When you're on Team Bring it, every morning your feet hit the floor, the good lord says "good morning" and the devil says 'Oh **** they're up' " - Dwayne "The Rock" Johnson
Are you on Team Bring it?
This so-called "filter" fails on so many levels. It would not take a genius to correct all of this, but then again we are talking about PW b:surrender
And like Asty said, it is basically ridiculous. Simple everyday words have become Tr*****ment.
-.-
RoidAbuse is awesome, only he would sell his sperm for gear!!
"Toughest monster? ..... RedsRose b:surrender" - Kantorek
Where is my 1 v 1 Kan? b:mischievous
RedsRose - Lost City wrote: »
This so-called "filter" fails on so many levels. It would not take a genius to correct all of this, but then again we are talking about PW b:surrender
And like Asty said, it is basically ridiculous. Simple everyday words have become Tr*****nt.
-.-
Fixed. And as there is currently an open topic on the issues with the current filter, please post these sorts of things there. Closed.
This discussion has been closed.