annotate rhodecode/tests/test_crawler.py @ 1336:e9fe4ff57cbb beta
Do a redirect to login for anonymous users
author | Marcin Kuzminski <marcin@python-works.com> |
---|---|
date | Sun, 15 May 2011 13:49:14 +0200 |
parents | 08cd02374883 |
children | bbfc3f305c6b |
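rev | changeset | description | author | parents |
---|---|---|---|---|
1332 | 3fdfecc52c32 | added test for crawling and memory usage | Marcin Kuzminski <marcin@python-works.com> | |
1334 | 08cd02374883 | Added mem_watch script. Test can also walk on file tree. Fixed some path issues | Marcin Kuzminski <marcin@python-works.com> | 1332 |
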
# -*- coding: utf-8 -*-
"""
    rhodecode.tests.test_crawler
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Test for crawling a project for memory usage

    watch -n1 ./rhodecode/tests/mem_watch

    :created_on: Apr 21, 2010
    :author: marcink
    :copyright: (C) 2009-2011 Marcin Kuzminski <marcin@python-works.com>
    :license: GPLv3, see COPYING for more details.
"""
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

import cookielib
import urllib
import urllib2
import vcs
import time

from os.path import join as jn

BASE_URI = 'http://127.0.0.1:5000/%s'
PROJECT = 'CPython'
PROJECT_PATH = jn('/', 'home', 'marcink', 'hg_repos')

# The jar is only used in memory; load()/save() are never called, so the
# file name below is not read or written.
cj = cookielib.FileCookieJar('/tmp/rc_test_cookie.txt')
o = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
o.addheaders = [
    ('User-agent', 'rhodecode-crawler'),
    ('Accept-Language', 'en-us,en;q=0.5')
]

urllib2.install_opener(o)


def test_changelog_walk(pages=100):
    total_time = 0
    # Visit changelog pages 1..pages inclusive, so the average below is
    # taken over exactly `pages` requests.
    for i in range(1, pages + 1):

        page = '/'.join((PROJECT, 'changelog',))

        full_uri = (BASE_URI % page) + '?' + urllib.urlencode({'page': i})
        s = time.time()
        f = o.open(full_uri)
        size = len(f.read())
        e = time.time() - s  # elapsed wall-clock time, in seconds
        total_time += e
        print 'visited %s size:%s req:%s s' % (full_uri, size, e)

    print 'total_time', total_time
    print 'average on req', total_time / float(pages)


def test_changeset_walk():
    print jn(PROJECT_PATH, PROJECT)
    total_time = 0

    repo = vcs.get_repo(jn(PROJECT_PATH, PROJECT))

    # Request the changeset page of every changeset in the repository.
    for i in repo:

        raw_cs = '/'.join((PROJECT, 'changeset', i.raw_id))

        full_uri = (BASE_URI % raw_cs)
        s = time.time()
        f = o.open(full_uri)
        size = len(f.read())
        e = time.time() - s  # elapsed wall-clock time, in seconds
        total_time += e
        print 'visited %s\\%s size:%s req:%s s' % (full_uri, i, size, e)

    print 'total_time', total_time
    print 'average on req', total_time / float(len(repo))

def test_files_walk():
    print jn(PROJECT_PATH, PROJECT)
    total_time = 0

    repo = vcs.get_repo(jn(PROJECT_PATH, PROJECT))

    # Collect the path of every file reachable from the tip changeset.
    paths_ = set()
    try:
        tip = repo.get_changeset('tip')
        for topnode, dirs, files in tip.walk('/'):
            for f in files:
                paths_.add(f.path)

    except vcs.exceptions.RepositoryError:
        pass

    for path in paths_:
        file_path = '/'.join((PROJECT, 'files', 'tip', path))

        full_uri = (BASE_URI % file_path)
        s = time.time()
        resp = o.open(full_uri)
        size = len(resp.read())
        e = time.time() - s  # elapsed wall-clock time, in seconds
        total_time += e
        print 'visited %s size:%s req:%s s' % (full_uri, size, e)

    print 'total_time', total_time
    # Average over the files actually requested, not the changeset count.
    print 'average on req', total_time / float(len(paths_) or 1)


#test_changelog_walk()
#test_changeset_walk()
test_files_walk()
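
The three walk functions above repeat the same fetch, time, and accumulate steps. As a rough sketch only, and not part of this changeset, that shared pattern could be factored out as shown below. The sketch is written against the Python 3 standard library (`urllib.parse.urlencode` in place of `urllib.urlencode`), and the names `changelog_uris`, `timed_walk`, and `fake_open` are hypothetical. `fake_open` is a stand-in for `o.open()` so the example runs without a RhodeCode server.

```python
import io
import time
from urllib.parse import urlencode

BASE_URI = 'http://127.0.0.1:5000/%s'  # same value as in the listing above


def changelog_uris(project, pages):
    """Yield the changelog URI for pages 1..pages inclusive."""
    page = '/'.join((project, 'changelog'))
    for i in range(1, pages + 1):
        yield (BASE_URI % page) + '?' + urlencode({'page': i})


def timed_walk(open_func, uris, clock=time.time):
    """Fetch each URI once; return (total_seconds, average_seconds)."""
    total = 0.0
    count = 0
    for uri in uris:
        start = clock()
        size = len(open_func(uri).read())
        elapsed = clock() - start  # seconds, as returned by the clock
        total += elapsed
        count += 1
        print('visited %s size:%s req:%.3f s' % (uri, size, elapsed))
    # Guard against an empty walk so the average never divides by zero.
    return total, (total / count if count else 0.0)


def fake_open(uri):
    # Hypothetical stand-in for o.open(): returns a file-like object
    # without touching the network.
    return io.BytesIO(b'x' * len(uri))
```

Under these assumptions, `timed_walk(fake_open, changelog_uris('CPython', 3))` walks three changelog pages. Passing the real opener's `open` method instead of `fake_open` would time actual requests, and injecting `clock` keeps the timing logic testable.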