Reminiscence

Self-Hosted Bookmark And Archive Manager

Features

  • Bookmark links and edit their metadata (such as title, tags, summary) via web-interface.

  • Archive link content in HTML, PDF or full-page PNG format.

  • Automatic archival of links to non-HTML content such as pdf, jpg, txt etc.

    i.e. bookmarking links to pdf, jpg etc. via web-interface will automatically save those files on the server.

  • Supports archival of media elements of a web-page using third party download managers.

  • Directory based categorization of bookmarks

  • Automatic tagging of HTML links.

  • Automatic summarization of HTML content.

  • Special readability mode.

  • Search bookmarks according to url, title, tags or summary.

  • Supports multiple user accounts.

  • Supports public and group directory for every user.

  • Upload any file from web-interface for archiving.

  • Easy to use admin interface for managing multiple users.

  • Import bookmarks from Netscape Bookmark HTML file format.

  • Supports streaming of archived media elements.

  • Annotation support for both HTML and its readable version.

  • Annotation support for both archived and uploaded pdf/epub files.

  • Remembers last read position of html (and its readable version), pdf and epub.

  • Rudimentary support for adding custom notes.
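
The Netscape Bookmark HTML format mentioned in the feature list stores each bookmark as an `<A HREF="...">` element and each folder as an `<H3>` heading followed by a nested `<DL>` list. As a rough illustration of what an importer has to read, here is a minimal standard-library sketch. It is not the project's own importer, and the class name `BookmarkParser` is hypothetical.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collect (folder, title, url) tuples from a Netscape bookmark file."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []
        self._folders = []   # stack of currently open folder names
        self._tag = None     # "a" or "h3" while inside that element
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag in ("a", "h3"):
            self._tag = tag
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "dl" and self._folders:
            # A closing </DL> ends the innermost open folder.
            self._folders.pop()
        elif tag == self._tag:
            text = "".join(self._text).strip()
            if tag == "h3":
                self._folders.append(text)
            elif self._href:
                folder = "/".join(self._folders)
                self.bookmarks.append((folder, text, self._href))
            self._tag = None
```

Feeding the contents of an exported bookmark file to `BookmarkParser().feed(...)` fills `bookmarks`; the folder part could then be mapped to a directory in the web-interface.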

Installation

  1. First make sure that Python 3.9+ (recommended version is 3.10+) is installed on the system, then install the following packages using the native package manager.

     1. virtualenv
    
     2. ~wkhtmltopdf (for html to pdf/png conversion)~ deprecated from v4.0+ due to a security vulnerability.
    
         * [hlspy](https://github.com/kanishka-linux/hlspy) is now the default headless browser, which is based on QtWebEngine.
    
     3. hlspy (mandatory from v4.0+)
    
     4. redis-server
    
     5. chromium (optional from v0.2+)
    
     6. PyQt5
    
     7. PyQtWebEngine
    
  2. Installation of the above dependencies on Arch or Arch-based distros

     $ sudo pacman -S python-virtualenv redis chromium python-pyqt5 qt5-webengine python-pyqtwebengine
    
  3. Installation of the above dependencies on Debian or Ubuntu-based distros

     $ sudo apt install virtualenv redis-server chromium-browser python3-pyqt5 python3-pyqt5.qtwebengine
    
  4. Install hlspy

     $ sudo pip3 install git+https://github.com/kanishka-linux/hlspy
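
Since package names vary between distros, a quick check that the prerequisites are actually visible can save time before continuing. The following is a minimal sketch using only the standard library. The executable names (`virtualenv`, `redis-server`, `hlspy`) and the module name `PyQt5` are assumptions taken from the list above and may differ on your system.

```python
import shutil
from importlib.util import find_spec

# Assumed names; adjust them to match your distro's packages.
REQUIRED_TOOLS = ["virtualenv", "redis-server", "hlspy"]
REQUIRED_MODULES = ["PyQt5"]


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_modules(names):
    """Return the top-level Python modules from `names` that cannot be located."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print("Missing executables:", missing_tools(REQUIRED_TOOLS) or "none")
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

Run it with the same `python3` interpreter that will later run the server, so that the module check reflects the right environment.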
    

Note: Names of the above dependencies may differ depending on the distro or OS, so install accordingly. Once the above dependencies are installed, execute the following commands in a terminal; they are distro/platform independent.

$ mkdir reminiscence

$ cd reminiscence

$ virtualenv -p python3 venv

$ python3 -m venv venv  # for python3.10+

$ source venv/bin/activate

$ cd venv

$ git clone https://github.com/kanishka-linux/reminiscence.git

$ cd reminiscence

$ source hlspy.env

$ pip install -r requirements.txt

$ mkdir logs archive tmp

$ python manage.py generatesecretkey

$ python manage.py nltkdownload

$ python manage.py migrate

$ python manage.py createsuperuser

$ python manage.py runserver 0.0.0.0:8000

Open 0.0.0.0:8000 in any browser, log in and start adding links.

**Note:** To access the web-interface from anywhere on the local network, replace the localhost address with the local IP address of your server.

Admin interface available at: /admin/
          

Setting up Celery (mandatory from v0.4 onwards):

  1. Generating PDFs and PNGs is resource intensive and time consuming. These tasks can be delegated to celery so that they execute in the background.

     Edit reminiscence/settings.py file and set `USE_CELERY = True`
    
  2. Now open another terminal in the same topmost project directory and execute following commands:

     $ sudo systemctl start redis-server
    
     $ cd venv
    
     $ source bin/activate
    
     $ cd reminiscence
    
     $ source hlspy.env
    
     $ celery -A reminiscence worker --loglevel=info -c 4 --detach
    

Using Docker

Note: The following procedure may not work exactly as described from v4.0+. The Dockerfiles have been updated, but users may still face some issues and are advised to make changes to the respective Dockerfile or docker-compose file as required.

Using docker is convenient compared to the normal installation method described above. It takes care of configuring and setting up gunicorn, nginx and the postgresql database, along with redis and the worker. (Setting up and running these three things manually can be a bit cumbersome; this is described below in a separate section.) It also automatically downloads the headless browser hlspy and the nltk data set, apart from installing python-based dependencies.

Note: From v4.0+, wkhtmltopdf is replaced with hlspy. Users are advised to migrate to v4.0 due to a security vulnerability in wkhtmltopdf. Users who find it difficult to migrate should at least disable automatic pdf/png generation of web pages in older reminiscence versions and use chromium manually for pdf generation instead.

  1. Install docker and docker-compose

  2. Enable/start the docker service. Instructions for enabling docker may differ between distros. Sample instructions for enabling and starting docker look like

     $ systemctl enable docker.service

     $ systemctl start docker.service
    
  3. Clone the GitHub repository and enter the directory

     $ git clone https://github.com/kanishka-linux/reminiscence.git
    
     $ cd reminiscence
    
  4. Build and start

     $ sudo docker-compose up --build
    
     Note: The above command will take some time when executed for the first time.
    
  5. The above step will also create a default user 'admin' with the default password 'changepassword'.

  6. If the IP address of the server is '192.168.1.2', then the admin interface will be available at

     192.168.1.2/admin/
    
     Note: In this method, there is no need to attach a port number to the IP address.
    
  7. Change the default admin password from the admin interface and create a new regular user. After that log out, and open '192.168.1.2'. Now log in as the regular user for regular activity.

  8. For custom configuration, modify nginx.conf and the Dockerfiles available in the repository, then execute step 4 again.

Note: Windows users who face problems mounting the data volume for Postgres are advised to refer to this issue.

Note: Ubuntu 16.04 users might have to modify the docker-compose.yml file and change version 3 to 2 (see issue).

Note: To set up celery inside docker, follow these instructions. Sometimes gunicorn doesn't work properly with the default background task handler inside docker. In such cases users can enable celery.

Documentation

Adding Directories And Links

  • Creating Directory

    Users first have to create a directory from the web interface.

    Note: Currently '/' and a few other special characters are not allowed in directory names. Users who face problems accessing a directory are advised to rename it and remove the special characters.


  • Adding Links

    Users have to navigate to the required directory and then add links to it. URLs are fetched asynchronously from the source to gather metadata initially. After a few seconds the page refreshes automatically and shows the new content. If nothing shows up after the automatic refresh (e.g. due to slow URL fetching), refresh the page manually by clicking on the directory entry again. In future, I may look into django channels and websockets to enable real-time duplex communication between client and server.
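
To make the idea of gathering metadata in the background concrete, here is a small sketch that submits one job per URL to a thread pool and extracts the page title from the fetched HTML. It only illustrates the general pattern and is not how Reminiscence itself is implemented. The names `TitleParser`, `gather_metadata` and `add_links` are hypothetical, and the `fetch` callable is supplied by the caller (for example a wrapper around an HTTP client).

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def gather_metadata(url, fetch):
    """Fetch `url` with the supplied callable and extract the page title."""
    parser = TitleParser()
    parser.feed(fetch(url))
    # Fall back to the URL itself when the page has no title.
    return {"url": url, "title": parser.title.strip() or url}


def add_links(urls, fetch, max_workers=4):
    """Submit one background job per URL and return the futures immediately."""
    pool = ThreadPoolExecutor(max_workers=max_workers)
    futures = [pool.submit(gather_metadata, url, fetch) for url in urls]
    pool.shutdown(wait=False)  # do not block the caller; queued jobs still run
    return futures
```

The caller gets the futures back right away and can collect results later with `future.result()`, which mirrors why the page only shows new entries after a short delay.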


Automatic Tagging and Summarization

This feature is implemented using the NLTK library, which is used for tokenization and for removing stopwords from sentences. Once stopwords are removed, the top K high-frequency words (where K is chosen by the user) are used as tags. To generate a summary of HTML content, a score is allotted to each sentence based on the frequency of the non-stopwords it contains. After that highests s
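
The frequency-based approach described above can be sketched in a few lines. This is an illustration only, not the project's code: it uses regular expressions and a tiny hard-coded stopword set as stand-ins for NLTK's tokenizers and stopword corpus, so that it runs without downloading NLTK data. It also assumes that the summary keeps the highest-scoring sentences, which the text above appears to describe. All function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword set; NLTK's stopword corpus is far larger.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def tokenize(text):
    """Lowercase word tokens (stand-in for NLTK's word tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter (stand-in for NLTK's sentence tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def word_frequencies(text):
    """Count every token that is not a stopword."""
    return Counter(w for w in tokenize(text) if w not in STOPWORDS)


def top_tags(text, k):
    """Return the k most frequent non-stopwords as tags."""
    return [word for word, _ in word_frequencies(text).most_common(k)]


def summarize(text, n):
    """Keep the n highest-scoring sentences, in their original order."""
    freq = word_frequencies(text)
    sentences = split_sentences(text)
    # A sentence's score is the summed frequency of its non-stopwords.
    scores = [
        sum(freq[w] for w in tokenize(s) if w not in STOPWORDS)
        for s in sentences
    ]
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:n]
    return " ".join(sentences[i] for i in sorted(best))
```

Summing raw frequencies favours long sentences; dividing each score by the sentence length is one common variation if shorter summaries are preferred.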
