Public comment applications can occasionally generate feedback that should not be made publicly visible. Data entered into specific fields, such as personal information, can be hidden and protected using the pop-up configuration and layer security, but sometimes entire comments need to be hidden to avoid displaying sensitive or explicit content to other users.
Automated moderation of feedback builds on the concepts of manual moderation: The layer to be moderated has a filter applied in the map. Features can be hidden by updating the value of a field so that it no longer meets the filter requirements. The automated moderation script can be used to perform the scan for explicit or sensitive words and phrases and update the field attributes accordingly. This script can be scheduled to run as frequently as you want using Windows Task Scheduler, but only one instance of the task can be running at a single time.
Configuring and executing this script requires the following components:
- Python 2.7 (Installed with ArcMap)
Make sure that your Python IDE is set to work against your Python 2.7 installation (for example, <default directory>\Python27\ArcGIS10.6\python.exe), and then skip ahead to the steps for configuring the script.
To set up the automated moderation script, complete the following steps:
- Apply a filter to the map layer you would like to moderate. This filter should be set using a field that exists only for moderation purposes (automated or manual), and it should accept two values: one to indicate when the report should be visible and one to indicate when it should be hidden.
- Unzip the folder and use a Python IDE or text editor to open the wordlist.py file from the WordFinder folder.
- Provide the URL to your ArcGIS organization and the credentials to an account that can edit the reporting layer in the
orgURL
,username
, andpassword
parameters. These values should be surrounded by quotation marks. - List the service or services to scan in the
services
parameter by providing the following values for each service:'url'
: The URL to the REST endpoint of the service.'status field'
: Name of a field containing the status of the report. This value is optional. When a field name is specified, the script will only scan reports that have a specific value in this field. The specific value is defined in thestatus_value
parameter later in this configuration file.'flag field'
: Name of the field that is updated when explicit or sensitive content is found so that the report no longer meets the filter requirements. The script will only process features where the value of this field matches the value of thevisible_value
parameter defined below.'reason field'
: Name of a text field where the script will populate the reason a report was filtered. This value is optional. If no field name is specified, no reason will be provided.'fields to scan'
: List of field names to scan for explicit or sensitive words. Each field name must be surrounded by quotation marks. All field names must be separated with commas. The entire list of fields (even if only one field is listed) must be surrounded with square brackets [].
- If a
status field
value was specified for one or more services above, provide the value of the field that indicates that the reports should be scanned in thestatus_value
parameter. For example, in the Citizen Problem Reporter workflows, you may want to scan only reports that have the valueSubmitted
in the STATUS field. Fields that have another status have been manually reviewed and processed. - At the
visible_value
parameter, provide the value of the field specified in theflag field
parameter that indicates that the report is currently visible and therefore should be scanned. Provide the value that will hide the report in thehidden_value
parameter. - The script will search for three types of content before deciding to update the
flag field
of a report to thehidden_value
. These searches are not case-sensitive, and will include common letter substitutions (listed below).- Explicit words that match exactly. These words should be listed at the
bad_words_exact
parameter. The script will only search for these combinations of letters where they exist surrounded by at least one space. This list should be used for explicit words that may exist as harmless components of other words. For example, if the word duck is included in this list, the word duckling would not be identified as explicit. - Explicit combinations of letters that should always be hidden regardless of their context. These words should be listed at the
bad_words
parameter. If the word duck is included in this list, duckling would also be considered explicit. - Words that indicate that the report might contain sensitive content that should be reviewed before deciding to make the report public. These words should be listed at the
sensitive_words
parameter. For example, nest sites are often considered areas that should not be advertised to the general public, so including the word nest in this list would ensure that reports including this word could undergo further review.
- Explicit words that match exactly. These words should be listed at the
- Some people will go to creative lengths to get content past moderation filters. The script includes a list of common character substitutions in the
subs
parameter, but this parameter can be updated if it is causing acceptable words to be caught or if unacceptable words are getting through. - Save your changes to this configuration file and test the script by double-clicking the find_words.py file in Windows Explorer. The script will report any errors in a log file that will be created in the same directory as the scripts. Also verify that the expected changes occurred in the service.
Set up Task Scheduler
Use Windows Task Scheduler to schedule the script to regularly scan the configured services and moderate content.
- Open the Task Scheduler on the desktop computer that is hosting the scripts.
- Click Action > Create Task and name your task.
- Click the Action tab and click New.
- Set Action to Start a Program.
- Browse to the location of your Python installation (for example, <default directory>\Python27\ArcGIS10.5\python.exe).
- In the Add arguments text box, type the name of the script (find_words.py).
- In the Start in text box, type the path to the folder where your script is and click OK.
- Click the Trigger tab, click New, and set a schedule for your task.
- Click OK. When the trigger occurs, the scripts will begin scanning the specified layers.