templates: properly escape inline JavaScript values

TLDR: Kallithea has issues with escaping values for use in inline JS.
Despite judicious poking of the code, no actual security vulnerabilities
have been found, just lots of corner-case bugs. This patch fixes those,
and hardens the code against actual security issues.

The long version:

To embed a Python value (typically a 'unicode' plain-text value) in a
larger file, it must be escaped in a context-specific manner. Example:

  >>> s = u'<script>alert("It\'s a trap!");</script>'

1) Escaped for insertion into HTML element context

  >>> print cgi.escape(s)
  &lt;script&gt;alert("It's a trap!");&lt;/script&gt;

2) Escaped for insertion into HTML element or attribute context

  >>> print h.escape(s)
  &lt;script&gt;alert(&#34;It&#39;s a trap!&#34;);&lt;/script&gt;

  This is the default Mako escaping, as usually used by Kallithea.

3) Encoded as JSON

  >>> print json.dumps(s)
  "<script>alert(\"It's a trap!\");</script>"

4) Escaped for insertion into a JavaScript file

  >>> print '(' + json.dumps(s) + ')'
  ("<script>alert(\"It's a trap!\");</script>")

  The parentheses are not actually required for strings, but may be needed
  to avoid syntax errors if the value is a number or dict (object).

5) Escaped for insertion into an HTML inline <script> element

  >>> print h.js(s)
  ("\x3cscript\x3ealert(\"It's a trap!\");\x3c/script\x3e")

  Here, we need to combine JS and HTML escaping, further complicated by
  the fact that "<script>" tag contents can either be parsed in XHTML mode
  (in which case '<', '>' and '&' must additionally be XML escaped) or
  HTML mode (in which case '</script>' must be escaped, but not using HTML
  escaping, which is not available in HTML "<script>" tags). Therefore,
  the XML special characters (which can only occur in string literals) are
  escaped using JavaScript string literal escape sequences.

  (This, incidentally, is why modern web security best practices ban all
  use of inline JavaScript...)

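The combination described in (5) can be sketched in a few lines of Python. This is an illustration only, not Kallithea's actual `h.js` code: the name `js_literal` is hypothetical, and the `\x3c`, `\x3e` and `\x26` escapes are an assumption modelled on the example output above, with `&` treated the same way because the text lists it among the XML special characters.

```python
import json


def js_literal(value):
    """Hypothetical stand-in for an h.js-style helper (not Kallithea's code).

    Encodes a Python value as a JavaScript expression that is safe to place
    inside an inline <script> element.
    """
    # Step 1: JSON encoding yields a valid JavaScript literal.
    encoded = json.dumps(value)
    # Step 2: '<', '>' and '&' can only occur inside string literals in
    # JSON output, so JavaScript string escapes can replace them safely.
    encoded = (encoded.replace("<", r"\x3c")
                      .replace(">", r"\x3e")
                      .replace("&", r"\x26"))
    # Step 3: parentheses keep numbers and objects syntactically safe.
    return "(" + encoded + ")"


if __name__ == "__main__":
    print(js_literal('<script>alert("It\'s a trap!");</script>'))
```

Because the escaping happens after JSON encoding, the result no longer contains a literal `</script>` in either parsing mode, while the script still sees the original string.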
Unsurprisingly, Kallithea does not do (5) correctly. In most cases,
Kallithea might slap a pair of single quotes around the HTML escaped Python
value. A typical benign example:

  $('#child_link').html('${_('No revisions')}');

This works in English, but if a localized version of the string contains
an apostrophe, the result will be broken JavaScript. In the more severe
cases, where the text is user controllable, it leaves the door open to
injections. In this example, the script inserts the string as HTML, so
Mako's implicit HTML escaping makes sense; but in many other cases, HTML
escaping is actually an error, because the value is not used by the script
in an HTML context.

The good news is that the HTML escaping thwarts attempts at XSS, since it's
impossible to inject syntactically valid JavaScript of any useful
complexity. It does allow JavaScript errors and gibberish to appear on the
page, though.

In these cases, the escaping has been fixed to use either the new 'h.js'
helper, which does JavaScript escaping (but not HTML escaping), OR the new
'h.jshtml' helper (which does both), in those cases where it was unclear if
the value might be used (by the script) in an HTML context. Some of these
can probably be "relaxed" from h.jshtml to h.js later, but for now, using
h.jshtml fixes escaping and doesn't introduce new errors.

In a few places, Kallithea JSON encodes values in the controller, then
inserts the JSON (without any further escaping) into <script> tags. This is
also wrong, and carries actual risk of XSS vulnerabilities. However, in all
cases, security vulnerabilities were narrowly avoided due to other filtering
in Kallithea. (E.g. many special characters are banned from appearing in
usernames.) In these cases, the escaping has been fixed and moved to the
template, making it immediately visible that proper escaping has been
performed.

Mini-FAQ (frequently anticipated questions):

Q: Why do everything in one big, hard to review patch?
Q: Why add escaping in specific case FOO, it doesn't seem needed?

Because the goal here is to have "escape everywhere" as the default policy,
rather than identifying individual bugs and fixing them one by one by
adding escaping where needed. As such, this patch surely introduces a lot
of needless escaping. This is no different from how Mako/Pylons HTML escape
everything by default, even when not needed: it errs on the side of
needless work, to prevent erring on the side of skipping required (and
security critical) work.

As for reviewability, the most important thing to notice is not where
escaping has been introduced, but any places where it might have been
missed (or where h.jshtml is needed, but h.js is used).

Q: The added escaping is kinda verbose/ugly.

That is not a question, but yes, I agree. Hopefully it'll encourage us to
move away from inline JavaScript altogether. That's a significantly larger
job, though; with luck this patch will keep us safe and secure until such a
time as we can implement the real fix.

Q: Why not use Mako filter syntax ("${val|h.js}")?

Because of long-standing Mako bug #140, preventing use of 'h' in filters.

Q: Why not work around bug #140, or even use straight "${val|js}"?

Because Mako still applies the default h.escape filter before the
explicitly specified filters.

Q: Where do we go from here?

Longer term, we should stop doing variable expansions in script blocks, and
instead pass data to JS via e.g. data attributes, or asynchronously using
AJAX calls. Once we've done that, we can remove inline JavaScript
altogether in favor of separate script files, and set a strict Content
Security Policy explicitly blocking inline scripting, and thus also the
most common kind of cross-site scripting attack.

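As a rough illustration of the "data attributes" direction mentioned in the last answer, a value can be JSON encoded and then HTML-attribute escaped, so that only ordinary HTML escaping is involved and no inline script is needed. The helper name `data_attribute` is hypothetical and not part of Kallithea; the sketch uses only the Python standard library.

```python
import html
import json


def data_attribute(name, value):
    """Hypothetical helper: render value as an HTML data-* attribute.

    The value is JSON encoded first, then escaped for a double-quoted HTML
    attribute. A separate script file could later read it back with
    JSON.parse(element.dataset[...]).
    """
    payload = html.escape(json.dumps(value), quote=True)
    return 'data-{}="{}"'.format(name, payload)


if __name__ == "__main__":
    print(data_attribute("message", "It's a </script> trap"))
```

The browser undoes the HTML escaping when it parses the attribute, so the script receives plain JSON and never has to be generated by the template.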
author Søren Løvborg <>
date Tue, 28 Feb 2017 17:19:00 +0100
parents d89d586b26ae
children 2c3d30095d5e

# Kallithea - Development config:                                              #
# listening on *:5000                                                          #
# sqlite and kallithea.db                                                      #
# initial_repo_scan = true                                                     #
# set debug = true                                                             #
# verbose and colorful logging                                                 #
#                                                                              #
# The %(here)s variable will be replaced with the parent directory of this file#

[DEFAULT]
debug = true
pdebug = false

## Email settings                                                             ##
##                                                                            ##
## Refer to the documentation ("Email settings") for more details.            ##
##                                                                            ##
## It is recommended to use a valid sender address that passes access         ##
## validation and spam filtering in mail servers.                             ##

## 'From' header for application emails. You can optionally add a name.
## Default:
#app_email_from = Kallithea
## Examples:
#app_email_from = Kallithea <>
#app_email_from =

## Subject prefix for application emails.
## A space between this prefix and the real subject is automatically added.
## Default:
#email_prefix =
## Example:
#email_prefix = [Kallithea]

## Recipients for error emails and fallback recipients of application mails.
## Multiple addresses can be specified, space-separated.
## Only addresses are allowed, do not add any name part.
## Default:
#email_to =
## Examples:
#email_to =
#email_to =

## 'From' header for error emails. You can optionally add a name.
## Default:
#error_email_from =
## Examples:
#error_email_from = Kallithea Errors <>
#error_email_from =

## SMTP server settings
## If specifying credentials, make sure to use secure connections.
## Default: Send unencrypted unauthenticated mails to the specified smtp_server.
## For "SSL", use smtp_use_ssl = true and smtp_port = 465.
## For "STARTTLS", use smtp_use_tls = true and smtp_port = 587.
#smtp_server =
#smtp_username =
#smtp_password =
#smtp_port = 25
#smtp_use_ssl = false
#smtp_use_tls = false
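How the SSL and STARTTLS combinations above map onto a mail client can be sketched with Python's standard `smtplib`. This is an assumption-laden illustration, not Kallithea's mailer: the functions `smtp_plan` and `open_connection` are hypothetical, and rejecting a configuration that enables both flags is a choice made for this sketch only.

```python
import smtplib


def as_bool(text):
    """Interpret an ini-style 'true'/'false' string."""
    return text.strip().lower() == "true"


def smtp_plan(settings):
    """Return (connection class, port, whether to issue STARTTLS)."""
    use_ssl = as_bool(settings.get("smtp_use_ssl", "false"))
    use_tls = as_bool(settings.get("smtp_use_tls", "false"))
    port = int(settings.get("smtp_port", "25"))
    if use_ssl and use_tls:
        # Assumption of this sketch: treat the combination as a mistake.
        raise ValueError("enable either smtp_use_ssl or smtp_use_tls, not both")
    factory = smtplib.SMTP_SSL if use_ssl else smtplib.SMTP
    return factory, port, use_tls


def open_connection(settings):
    """Open a connection according to the plan (performs network I/O)."""
    factory, port, starttls = smtp_plan(settings)
    conn = factory(settings["smtp_server"], port)
    if starttls:
        conn.starttls()  # upgrade the plain connection before logging in
    if settings.get("smtp_username"):
        conn.login(settings["smtp_username"], settings.get("smtp_password", ""))
    return conn
```

With no options set, the plan is a plain, unauthenticated connection on port 25, matching the default described in the comments.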

[server:main]
## PASTE ##
#use = egg:Paste#http
## number of worker threads to spawn
#threadpool_workers = 1
## max requests before thread respawn
#threadpool_max_requests = 100
## option to use threads instead of processes
#use_threadpool = true

use = egg:waitress#main
## number of worker threads
threads = 1
max_request_body_size = 107374182400
## use poll instead of select, fixes fd limits, may not work on old
## windows systems.
#asyncore_use_poll = True

#use = egg:gunicorn#main
## number of process workers. You must set `instance_id = *` when this option
## is set to more than one worker
#workers = 1
## process name
#proc_name = kallithea
## type of worker class, one of sync, eventlet, gevent, tornado
## for bigger setups, a worker class other than sync is recommended
#worker_class = sync
#max_requests = 1000
## amount of time a worker can handle request before it gets killed and
## restarted
#timeout = 3600

## UWSGI ##
## run with uwsgi --ini-paste-logged <inifile.ini>
#socket = /tmp/uwsgi.sock
#master = true
#http =

## set as daemon and redirect all output to file
#daemonize = ./uwsgi_kallithea.log

## master process PID
#pidfile = ./

## stats server with worker statistics;
## use `uwsgitop` for monitoring
#stats =
#memory-report = true

## log 5XX errors
#log-5xx = true

## Set the socket listen queue size.
#listen = 256

## Gracefully Reload workers after the specified amount of managed requests
## (avoid memory leaks).
#max-requests = 1000

## enable large buffers
#buffer-size = 65535

## socket and http timeouts ##
#http-timeout = 3600
#socket-timeout = 3600

## Log requests slower than the specified number of milliseconds.
#log-slow = 10

## Exit if no app can be loaded.
#need-app = true

## Set lazy mode (load apps in workers instead of master).
#lazy = true

## scaling ##
## set cheaper algorithm to use, if not set default will be used
#cheaper-algo = spare

## minimum number of workers to keep at all times
#cheaper = 1

## number of workers to spawn at startup
#cheaper-initial = 1

## maximum number of workers that can be spawned
#workers = 4

## how many workers should be spawned at a time
#cheaper-step = 1

## COMMON ##
#host =
host =
port = 5000

## middleware for hosting the WSGI application under a URL prefix
#[filter:proxy-prefix]
#use = egg:PasteDeploy#prefix
#prefix = /<your-prefix>

[app:main]
use = egg:kallithea
## enable proxy prefix middleware
#filter-with = proxy-prefix

full_stack = true
static_files = true
## Available Languages:
## cs de fr hu ja nl_BE pl pt_BR ru sk zh_CN zh_TW
lang =
cache_dir = %(here)s/data
index_dir = %(here)s/data/index
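The `%(here)s` substitution used in these paths can be reproduced with Python's standard `configparser`, which applies the same `%(name)s` interpolation syntax. This is only a sketch: when the file is loaded for real, the loader supplies `here` itself, whereas the hypothetical `load_paths` function below passes it in as a default, and the `[app:main]` section name and the `/srv/kallithea` directory are assumptions made for the example.

```python
import configparser

# A two-line excerpt in the style of the config above; the section name
# is an assumption made for this sketch.
INI_TEXT = """
[app:main]
cache_dir = %(here)s/data
index_dir = %(here)s/data/index
"""


def load_paths(here):
    """Resolve cache_dir and index_dir relative to the directory 'here'."""
    # Defaults are visible in every section, so %(here)s can be resolved.
    parser = configparser.ConfigParser(defaults={"here": here})
    parser.read_string(INI_TEXT)
    section = parser["app:main"]
    return section["cache_dir"], section["index_dir"]


if __name__ == "__main__":
    print(load_paths("/srv/kallithea"))
```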

## perform a full repository scan on each server start, this should be
## set to false after first startup, to allow faster server restarts.
#initial_repo_scan = false
initial_repo_scan = true

## uncomment and set this path to use archive download cache
archive_cache_dir = %(here)s/tarballcache

## change this to a unique ID for security
app_instance_uuid = development-not-secret

## cut off limit for large diffs (size in bytes)
cut_off_limit = 256000

## force https in Kallithea, fixes https redirects, assumes it's always https
force_https = false

## use Strict-Transport-Security headers
use_htsts = false

## number of commits stats will parse on each iteration
commit_parse_limit = 25

## path to git executable
git_path = git

## git rev filter option, --all is the default filter, if you need to
## hide all refs in changelog switch this to --branches --tags
#git_rev_filter = --branches --tags

## RSS feed options
rss_cut_off_limit = 256000
rss_items_per_page = 10
rss_include_diff = false

## options for showing and identifying changesets
show_sha_length = 12
show_revision_number = false

## Canonical URL to use when creating full URLs in UI and texts.
## Useful when the site is available under different names or protocols.
## Defaults to what is provided in the WSGI environment.
#canonical_url =

## gist URL alias, used to create nicer urls for gist. This should be an
## url that does rewrites to _admin/gists/<gistid>.
## example:{gistid}. Empty means use the internal
## Kallithea url, ie. http[s]://<gistid>
gist_alias_url =

## whitelist of API-enabled controllers. Lists the controllers to which
## access is enabled by api_key. E.g. to enable API access to raw_files,
## add `FilesController:raw`; to enable access to patches, add
## `ChangesetController:changeset_patch`. The list must be comma-separated.
## Syntax is <ControllerClass>:<function>. Check debug logs for generated names
## Recommended settings below are commented out:
api_access_controllers_whitelist =
#    ChangesetController:changeset_patch,
#    ChangesetController:changeset_raw,
#    FilesController:raw,
#    FilesController:archivefile

## default encoding used to convert from and to unicode
## can also be a comma-separated list of encodings in case of mixed encodings
default_encoding = utf8

## issue tracker for Kallithea (leave blank to disable, absent for default)
#bugtracker =

## issue tracking mapping for commit messages
## uncomment issue_pat, issue_server_link, issue_prefix to enable

## pattern to get the issues from commit messages
## default one used here is #<numbers> with a regex non-capturing group for `#`
## {id} will be all groups matched from this pattern

issue_pat = (?:\s*#)(\d+)

## server url to the issue; each {id} is replaced with the match fetched
## from the regex, {repo} is replaced with the full repository name
## including groups, and {repo_name} is replaced with just the name of the repo

issue_server_link ={repo}/issue/{id}

## prefix to add to link to indicate it's an url
## #314 will be replaced by <issue_prefix><id>

issue_prefix = #
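The interplay of `issue_pat`, `issue_server_link` and `issue_prefix` can be illustrated with a short Python sketch. It is not Kallithea's implementation: the function `issue_links` is hypothetical, and because the server URL is not shown above, the `https://issues.example.com/` address is a placeholder assumed for the example.

```python
import re

ISSUE_PAT = r"(?:\s*#)(\d+)"  # pattern from the config above
# Placeholder URL, assumed for this sketch only.
ISSUE_SERVER_LINK = "https://issues.example.com/{repo}/issue/{id}"
ISSUE_PREFIX = "#"


def issue_links(message, repo):
    """Return (label, url) pairs for every issue reference in message."""
    links = []
    for match in re.finditer(ISSUE_PAT, message):
        # {id} is built from all captured groups; the '#' itself sits in
        # a non-capturing group and is therefore not part of the id.
        issue_id = "".join(match.groups())
        url = ISSUE_SERVER_LINK.replace("{id}", issue_id).replace("{repo}", repo)
        links.append((ISSUE_PREFIX + issue_id, url))
    return links


if __name__ == "__main__":
    for label, url in issue_links("closes #314 and #15", "group/myrepo"):
        print(label, url)
```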

## issue_pat, issue_server_link, issue_prefix can have suffixes to specify
## multiple patterns, to other issues server, wiki or others
## below an example how to create a wiki pattern
# wiki-some-id ->

#issue_pat_wiki = (?:wiki-)(.+)
#issue_server_link_wiki ={id}
#issue_prefix_wiki = WIKI-

## alternative return HTTP header for failed authentication. Default HTTP
## response is 401 HTTPUnauthorized. Currently Mercurial clients have trouble with
## handling that. Set this variable to 403 to return HTTPForbidden
auth_ret_code =

## locking return code. When repository is locked return this HTTP code. 2XX
## codes don't break the transactions while 4XX codes do
lock_ret_code = 423

## allow changing the repository location on the settings page
allow_repo_location_change = True

## allow setting up custom hooks on the settings page
allow_custom_hooks_settings = True

## extra extensions for indexing, space separated and without the leading '.'.
# index.extensions =
#    gemfile
#    lock

## extra filenames for indexing, space separated
# index.filenames =
#    .dockerignore
#    .editorconfig

###        CELERY CONFIG        ####

use_celery = false

## Example: connect to the virtual host 'rabbitmqhost' on localhost as rabbitmq:
broker.url = amqp://rabbitmq:qewqew@localhost:5672/rabbitmqhost

celery.imports = kallithea.lib.celerylib.tasks
celery.accept.content = pickle
celery.result.backend = amqp
celery.result.dburi = amqp://
celery.result.serialier = json

#celery.send.task.error.emails = true
#celery.amqp.task.result.expires = 18000

celeryd.concurrency = 2
celeryd.max.tasks.per.child = 1

## If true, tasks will never be sent to the queue, but executed locally instead.
celery.always.eager = false

###         BEAKER CACHE        ####

beaker.cache.data_dir = %(here)s/data/cache/data
beaker.cache.lock_dir = %(here)s/data/cache/lock

beaker.cache.regions = short_term,long_term,sql_cache_short

beaker.cache.short_term.type = memory
beaker.cache.short_term.expire = 60
beaker.cache.short_term.key_length = 256

beaker.cache.long_term.type = memory
beaker.cache.long_term.expire = 36000
beaker.cache.long_term.key_length = 256

beaker.cache.sql_cache_short.type = memory
beaker.cache.sql_cache_short.expire = 10
beaker.cache.sql_cache_short.key_length = 256

###       BEAKER SESSION        ####

## Name of session cookie. Should be unique for a given host and path, even when running
## on different ports. Otherwise, cookie sessions will be shared and messed up.
beaker.session.key = kallithea
## Sessions should always only be accessible by the browser, not directly by JavaScript.
beaker.session.httponly = true
## Session lifetime. 2592000 seconds is 30 days.
beaker.session.timeout = 2592000

## Server secret used with HMAC to ensure integrity of cookies.
beaker.session.secret = development-not-secret
## Further, encrypt the data with AES.
#beaker.session.encrypt_key = <key_for_encryption>
#beaker.session.validate_key = <validation_key>
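The idea of using a server secret with HMAC to protect cookie integrity can be sketched as follows. This is a generic illustration, not Beaker's actual cookie format: the functions `sign` and `verify`, the `digest:payload` layout and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import hmac


def _digest(payload, secret):
    """Hex HMAC of the payload, keyed with the server secret."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest().encode("ascii")


def sign(payload, secret):
    """Prefix the payload with its HMAC so tampering can be detected."""
    return _digest(payload, secret) + b":" + payload


def verify(cookie, secret):
    """Return the payload if the signature matches, otherwise None."""
    digest, sep, payload = cookie.partition(b":")
    if not sep:
        return None
    # compare_digest avoids leaking information through timing.
    if hmac.compare_digest(digest, _digest(payload, secret)):
        return payload
    return None
```

A client without the secret cannot produce a valid digest for modified data, which is why a guessable value such as `development-not-secret` is only acceptable in a development setup.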

## Type of storage used for the session, current types are
## dbm, file, memcached, database, and memory.

## File system storage of session data. (default)
#beaker.session.type = file

## Cookie only, store all session data inside the cookie. Requires secure secrets.
#beaker.session.type = cookie

## Database storage of session data.
#beaker.session.type = ext:database
#beaker.session.sa.url = postgresql://postgres:qwe@localhost/kallithea
#beaker.session.table_name = db_session


### [appenlight] ###

## AppEnlight is tailored to work with Kallithea, see
## for details how to obtain an account
## you must install python package `appenlight_client` to make it work

## appenlight enabled
appenlight = false

appenlight.server_url =
appenlight.api_key = YOUR_API_KEY


## enables 404 error logging (default False)
appenlight.report_404 = false

## time in seconds after request is considered being slow (default 1)
appenlight.slow_request_time = 1

## record slow requests in application
## (needs to be enabled for slow datastore recording and time tracking)
appenlight.slow_requests = true

## enable hooking to application loggers
#appenlight.logging = true

## minimum log level for log capture
#appenlight.logging.level = WARNING

## send logs only from erroneous/slow requests
## (saves API quota for intensive logging)
appenlight.logging_on_error = false

## list of additional keywords that should be grabbed from environ object
## can be string with comma separated list of words in lowercase
## (by default client will always send following info:
## start with HTTP* this list be extended with additional keywords here
appenlight.environ_keys_whitelist =

## list of keywords that should be blanked from request object
## can be string with comma separated list of words in lowercase
## (by default client will always blank keys that contain following words
## 'password', 'passwd', 'pwd', 'auth_tkt', 'secret', 'csrf'
## this list can be extended with additional keywords set here
appenlight.request_keys_blacklist =

## list of namespaces that should be ignored when gathering log entries
## can be string with comma separated list of namespaces
## (by default the client ignores own entries: appenlight_client.client)
appenlight.log_namespace_blacklist =

### [sentry] ###

## sentry is an alternative open source error aggregator
## you must install python packages `sentry` and `raven` to enable

sentry.dsn = YOUR_DSN
sentry.servers = =
sentry.key =
sentry.public_key =
sentry.secret_key =
sentry.project = =
sentry.include_paths =
sentry.exclude_paths =

## Debug mode will enable the interactive debugging tool, allowing ANYONE to  ##
## execute malicious code after an exception is raised.                       ##
#set debug = false
set debug = true

###       LOGVIEW CONFIG       ###

logview.sqlalchemy = #faa
logview.pylons.templating = #bfb
logview.pylons.util = #eee


# SQLITE [default]
sqlalchemy.url = sqlite:///%(here)s/kallithea.db?timeout=60

#sqlalchemy.url = postgresql://user:pass@localhost/kallithea

#sqlalchemy.url = mysql://user:pass@localhost/kallithea?charset=utf8

# see sqlalchemy docs for others

sqlalchemy.echo = false
sqlalchemy.pool_recycle = 3600


[alembic]
script_location = kallithea:alembic


[loggers]
keys = root, routes, kallithea, sqlalchemy, beaker, templates, whoosh_indexer

[handlers]
keys = console, console_sql

[formatters]
keys = generic, color_formatter, color_formatter_sql


[logger_root]
level = NOTSET
handlers = console

[logger_routes]
level = DEBUG
handlers =
qualname = routes.middleware
## "level = DEBUG" logs the route matched and routing variables.
propagate = 1

[logger_beaker]
level = DEBUG
handlers =
qualname = beaker.container
propagate = 1

[logger_templates]
level = INFO
handlers =
qualname = pylons.templating
propagate = 1

[logger_kallithea]
level = DEBUG
handlers =
qualname = kallithea
propagate = 1

[logger_sqlalchemy]
level = INFO
handlers = console_sql
qualname = sqlalchemy.engine
propagate = 0

[logger_whoosh_indexer]
level = DEBUG
handlers =
qualname = whoosh_indexer
propagate = 1


[handler_console]
class = StreamHandler
args = (sys.stderr,)
#level = INFO
level = DEBUG
#formatter = generic
formatter = color_formatter

[handler_console_sql]
class = StreamHandler
args = (sys.stderr,)
#level = WARN
level = DEBUG
#formatter = generic
formatter = color_formatter_sql


[formatter_generic]
format = %(asctime)s.%(msecs)03d %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %Y-%m-%d %H:%M:%S

[formatter_color_formatter]
class = kallithea.lib.colored_formatter.ColorFormatter
format = %(asctime)s.%(msecs)03d %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %Y-%m-%d %H:%M:%S

[formatter_color_formatter_sql]
class = kallithea.lib.colored_formatter.ColorFormatterSql
format = %(asctime)s.%(msecs)03d %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %Y-%m-%d %H:%M:%S
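What the `format` and `datefmt` strings above produce can be checked with Python's standard `logging.Formatter`. The sketch below is an illustration only: the `render` function is hypothetical, it uses the plain formatter rather than Kallithea's colored ones, and it pins the timestamp to UTC (an assumption made so the output does not depend on the local time zone).

```python
import logging
import time

FORMAT = "%(asctime)s.%(msecs)03d %(levelname)-5.5s [%(name)s] %(message)s"
DATEFMT = "%Y-%m-%d %H:%M:%S"


def render(name, levelname, message, created=0.0):
    """Format one synthetic log record with the config's format strings."""
    formatter = logging.Formatter(FORMAT, DATEFMT)
    formatter.converter = time.gmtime  # UTC, for reproducible output
    record = logging.makeLogRecord({
        "name": name,
        "levelname": levelname,
        "msg": message,
        "created": created,
        "msecs": (created % 1) * 1000,
    })
    return formatter.format(record)


if __name__ == "__main__":
    # '%(levelname)-5.5s' pads short names and truncates long ones to
    # five characters, so WARNING is shown as WARNI.
    print(render("kallithea", "WARNING", "hello"))
    print(render("routes.middleware", "INFO", "matched"))
```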