SearchIt: A search engine built with Glitch

So, I built a search engine. It’s real. It crawls the web, fetches metadata about each page, and stores it so you can search for it later. No APIs, just Glitch. It uses your browser to search through the data, since running the search on Glitch would be memory/CPU intensive.

I’m using some package (forgot the name), and the metadata does not have to come only from meta tags. For example, https://example.com does not have any meta tags, but the engine is still able to get metadata from it.
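To illustrate the fallback idea, here is a minimal Python sketch using only the standard library. It is not the package SearchIt uses (that name isn't given here); `MetadataParser` and `extract_metadata` are hypothetical names, and falling back to the first paragraph is an assumption about what "metadata without meta tags" could mean.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects the <title>, the meta description, and the first <p> text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.first_paragraph = ""
        self._in_title = False
        self._in_p = False
        self._p_done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "p" and not self._p_done:
            self._in_p = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_p:
            self._in_p = False
            self._p_done = True  # only the first paragraph is kept

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self.first_paragraph += data


def extract_metadata(html):
    """Return {"title", "description"}, or None when the page has no title."""
    parser = MetadataParser()
    parser.feed(html)
    title = parser.title.strip()
    if not title:
        return None  # assumption: untitled pages are not indexed
    # Prefer the meta description; otherwise fall back to the first paragraph.
    description = parser.description or " ".join(parser.first_paragraph.split())
    return {"title": title, "description": description}
```

A page with a title and a paragraph but no meta description would then get the paragraph text as its description.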

Try it out at: https://searchit.glitch.me/
You can submit your site at: https://searchit.glitch.me/submit. Wait about 10 seconds and try searching for it.

The search engine skips webpages without a title.

This is the search snippet I used, though I made a bunch of modifications to it.

Well done! What about allowing all sites to be crawled?

Cool! Is this boosted somehow?

Please add a filter to the submit page. I was easily able to add an “unprofessional” website to the DB.

What do you mean?

I am not going to go into too much detail, but there are some questionable sites that have already been added.

So FurAffinity?

No, I mean like 18+.

Oh… I see…

Website crashed…

this is my school pc btw

aboutDavid said it was under maintenance apparently

I added a filter, so I had to wipe the database.

Hopefully searches will be faster now, as I added caching and database compression.

18+ content?

Can you add this to github? I’d like to add a flag feature to remove inappropriate urls.

If you submit a URL, please add https:// or http:// to the beginning of it. Otherwise, the engine won’t fetch it.
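For anyone wondering what such a check might look like, here is a small Python sketch. It is an illustration only, not SearchIt's actual code; `is_fetchable_url` is a made-up name.

```python
from urllib.parse import urlparse


def is_fetchable_url(url):
    """True only for URLs that start with http:// or https:// and name a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

With this check, `is_fetchable_url("https://searchit.glitch.me/")` passes, while a bare `searchit.glitch.me` is rejected because it has no scheme.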

If your site shows up and it has no description, please add metatags to your site. You can generate them easily here.

Poll:

Should I restrict SearchIt to only *.glitch.me sites?
  • Yes
  • No

Fun fact: going to https://searchit.glitch.me/search?q= shows all the websites.
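One plausible reason for this, assuming the search is a plain case-insensitive substring match (an assumption, not something confirmed in this thread): an empty query is a substring of every string, so every stored page matches. A minimal Python sketch with made-up sample data and a hypothetical `search` function:

```python
def search(pages, query):
    """Return pages whose title or description contains the query."""
    q = query.lower()
    return [
        page
        for page in pages
        if q in page["title"].lower() or q in page["description"].lower()
    ]


# Made-up sample data, only for illustration.
PAGES = [
    {"title": "Cat Photos", "description": "Pictures of cats"},
    {"title": "Dog Blog", "description": "Posts about dogs"},
]
```

Here `search(PAGES, "cat")` returns only the first page, while `search(PAGES, "")` returns both.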

Yeah, I wanted it to be something like a directory index.

Should I make this open source?
  • Yes
  • No

heck yes make it open source

I would love to help out! Would you consider using SQL over JSON to prevent the risk of corruption?

Jsoning to the rescue aaaaaah

Personally, don’t use SQL, it’s as old as PHP. :grimacing: :no_mouth:

Use NoSQL databases, like Firebase or Deta Base or IndexedDB. Or if you’re still set on using SQL, use a database adapter like https://endb.js.org to make your life easier.

SQL is outdated? Most companies use some sort of SQL. It’s proven to do the job, unlike JSON, which is prone to corruption.

The reason one would not use something like Jsoning is that it’s not efficient enough for them.
Jsoning may or may not load all its data into memory, which is not optimal. The real inefficiency may show up when there is a larger number of keys: in the worst case a lookup takes O(N), where N is the total number of keys. SQL and NoSQL databases have a ton of optimizations and complex data structures, which is why most businesses use them.
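As a rough illustration of the indexing point, here is a Python sketch using the standard-library `sqlite3` module. The `pages` table, its columns, and the sample URLs are invented for this example and say nothing about how SearchIt stores its data. Declaring `url` as the primary key makes SQLite keep an index on it, so a lookup by URL does not have to scan every row.

```python
import sqlite3

# In-memory database with an invented schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [(f"https://site{i}.example/", f"Site {i}", "") for i in range(1000)],
)


def find_page(url):
    """Look up one page by its URL; returns a (title,) tuple or None."""
    return conn.execute(
        "SELECT title FROM pages WHERE url = ?", (url,)
    ).fetchone()


# Ask SQLite how it would run the lookup; the last column is a readable summary.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT title FROM pages WHERE url = ?", ("x",)
).fetchall()
plan_detail = plan[0][-1]
```

Printing `plan_detail` should show a search that uses the primary-key index rather than a full table scan; the exact wording depends on the SQLite version.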

Old isn’t really a great excuse. Python is older than PHP!

We often build on top of old things, as xkcd says.

Wait, JSON? Are you talking about a JSON file or a JSON DB?

Yep, JSON can get corrupted easily.

Any DB can get corrupted. The reason I switched to MongoDB was that it had free managed hosting by the people who made it, whereas with MySQL you are at the mercy of yourself or an unreliable host. So far the only free hosting I have found is RemoteMySQL (you can only use 100 MB of data and you don’t get access to a full database server, just one DB on a server that is shared with other people) and 000webhost.

Hosting MySQL on Glitch works fine.

I know a service called pythonanywhere.com that seems to have OK hosting. I’ve also heard that you can get free PostgreSQL hosting from Heroku.

PHP is actually good now. SQL is a good, solid, useful language, and it has many stable, refined implementations. It is mostly portable, which you can’t claim for most NoSQL databases.

Depending on what you are storing, 100 MB of SQL storage isn’t bad. My biggest database is ~22,000 rows and it only takes up ~2 MB.

Yeah, for my SQL apps, I can start small with SQLite and move to MySQL, PostgreSQL, or MSSQL if I need to scale up.
With MongoDB you only have the server/clusters, which in most cases cannot run on Glitch.

Just add a report button and a blacklist array of inappropriate websites, which I won’t mention.
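A blocklist check along those lines could be sketched like this in Python. The hostnames are placeholders and `is_blocked` is a hypothetical helper, not something from SearchIt. Matching on the hostname instead of the raw URL string also catches subdomains of a blocked site.

```python
from urllib.parse import urlparse

# Placeholder entries; a real list would be maintained by the site owner.
BLOCKED_HOSTS = {"blocked.example", "also-blocked.example"}


def is_blocked(url, blocked_hosts=BLOCKED_HOSTS):
    """True if the URL's host is a blocked host or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == blocked or host.endswith("." + blocked)
        for blocked in blocked_hosts
    )
```

A report button could then simply append the reported host to the list after someone reviews it.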

Talking about the host.

The filter doesn’t work: https://searchit.glitch.me/search?q= shows adult content.
Edit: I think you should scan the source of the site for bad words and, if any appear, not add it.
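A rough Python sketch of that kind of scan, with placeholder words and a hypothetical `contains_banned_word` function (nothing here comes from SearchIt's code):

```python
import re

# Placeholder words; a real list would be much longer.
BANNED_WORDS = ["badword", "worseword"]

# Match whole words only, ignoring case, so "badwordish" does not trigger it.
_BANNED_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, BANNED_WORDS)) + r")\b",
    re.IGNORECASE,
)


def contains_banned_word(page_source):
    """True if any banned word appears as a whole word in the page source."""
    return _BANNED_PATTERN.search(page_source) is not None
```

A word list like this is easy to get around (misspellings, text inside images), so it probably works best alongside a report button rather than on its own.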

It’s a bit obvious who is doing it (no, not you), and I have screenshots that support, though don’t prove, my idea of who did it.

It really doesn’t matter who is doing it, you should have a filter in place to flag or remove 18+ websites.

I do, people bypass it.
