HN Search powered by Algolia

WebScaleSQL | "We're Gonna Need A Bigger Database"(http://webscalesql.org/)

20 points|mikeevans|11 years ago|0 comments

WebScaleSQL "We're Gonna Need A Bigger Database"(http://webscalesql.org)

2 points|tbassetto|11 years ago|1 comments

Deciphering Glyph: A Bigger Database(https://blog.glyph.im/2025/03/a-bigger-database.html)

2 points|cratermoon|10 days ago|0 comments

A Bigger Database(https://blog.glyph.im/2025/03/a-bigger-database.html)

1 points|zdw|11 days ago|0 comments

Show HN: Investor Hunt is a database of 40k+ investors to raise your seed round

6 points|juhaszhenderson|7 years ago|6 comments

Hey guys!

Investor Hunt (https://www.investorhunt.co) is a categorized database of 40k+ angel investors and venture capitalists.

Figuring out which investors are likely to be interested in your project and how to get in direct contact with them is a nightmare. All this data is spread out, and finding it pushes you down a rabbit hole of AngelList, personal websites, Crunchbase, and VC fund portfolio pages. All this can take hundreds of hours of research––hours you could be using to actually close rounds.

For the last few years while we’ve been fundraising for different projects, we've built out spreadsheets of all the investors we’ve met with (and who we’d want to meet) complete with data like where to best contact them. Over the years, our private lists grew to over a thousand investors.

After building the data engineering tools to make Press Hunt (https://www.producthunt.com/posts/press-hunt) and Howler (https://www.howler.media/) possible, we realized it would make perfect sense to use them to grow our lists of investors into a comprehensive all-in-one database.

Investor Hunt is a simple SAAS product that aggregates the names, contact info, investment focuses, past investments, and locations of the best investors in the world in one place. We’ve grown the list to 40k+ investors (~30k of them have emails, and the rest have alternative contact links like Twitter, LinkedIn, and AngelList).

The database is still very much a WIP––we’re constantly enriching data and adding new sources. We’ve been using this ourselves (and have a few beta users), and have found it really useful, and it’s now 10x bigger than other investor database alternatives.

We also just launched on Product Hunt: https://www.producthunt.com/posts/investor-hunt.

We’d love feedback––thanks for the time!

- Matt, David, Ermek, & Rashid

Ask HN: Single Server with Filesystem Database?

2 points|deepstream|6 years ago|1 comments

I'm currently building a software-as-a-service app and using a novel architecture

I'm going to use the Unix (specifically Debian) file system and user accounts as the users database. So basically, whenever someone signs up, I run a shell script that creates a new user and adds them to some relevant groups. The benefits are that I get password checking baked in and I can save their data as simple files under the user's home directory.

I'll disable shell login as well. Whenever I run a workload for the service application, I can use the operating system to run that command as the actual user.
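A minimal sketch of the signup and run-as steps described above, written in Python rather than shell so the commands can be inspected before anything is executed. It only builds argument lists for the standard `useradd` and `runuser` tools; the group names, the username rule, and the `/usr/sbin/nologin` path are assumptions for illustration, not details from the post.

```python
import re
import shlex

# Assumed path of the "no shell" binary on Debian; verify on the target system.
NOLOGIN_SHELL = "/usr/sbin/nologin"

# Conservative username rule (an assumption) so signup input never reaches
# the command line unchecked.
USERNAME_RE = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def check_username(username):
    """Raise ValueError unless the name matches the conservative rule."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError(f"unsafe username: {username!r}")
    return username


def build_useradd_command(username, groups):
    """Return argv that creates an account with a home directory and no shell."""
    cmd = ["useradd", "--create-home", "--shell", NOLOGIN_SHELL]
    if groups:
        cmd += ["--groups", ",".join(groups)]
    cmd.append(check_username(username))
    return cmd


def build_run_as_command(username, workload):
    """Return argv that runs a workload as the given user via runuser."""
    return ["runuser", "-u", check_username(username), "--"] + list(workload)


if __name__ == "__main__":
    # Print the commands instead of running them; both need root to execute.
    print(shlex.join(build_useradd_command("alice", ["customers"])))
    print(shlex.join(build_run_as_command("alice", ["python3", "job.py"])))
```

Passing these lists to `subprocess.run` as root would perform the actual work. Validating the username first matters here because the name comes straight from a signup form.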

I just figure it's a simple and useful architecture because I get all the benefits of a multi-user operating system baked in without having to think about any of that myself. I don't have to worry about a database.

And if my audience scales, then I just move to a bigger instance. Plus, if I need to debug or support something, I can just go into the user's home directory and change it easily myself.

Also, looking at the pricing of instances and the pricing of my application, this will be affordable, and to make this a happy side business I don't need that many paying users.

I just figured this is such a good and simple architecture, where it's all on the one machine and I can manage it all myself. I'm running it in the cloud, but I could also move it out if I wanted to. It gives a lot of flexibility and simplicity, and I just think it's better than, you know, having a separate database and separate web server and so on.

There are probably downsides to this, so I am asking for ideas about them so I can handle them before they surprise me.

I wrote a free book about TDD and clean architecture in Python

192 points|thedigicat|6 years ago|62 comments

Hey HN,

I just published on Leanpub a free book, "Clean Architectures in Python". It's a humble attempt to organise and expand some posts I published on my blog in the last years.

You can find it here: https://leanpub.com/clean-architectures-in-python

The main content is divided in two parts, this is a brief overview of the table of contents

* Part 1 - Tools
- Chapter 1 - Introduction to TDD
- Chapter 2 - On unit testing
- Chapter 3 - Mocks

* Part 2 - The clean architecture
- Chapter 1 - Components of a clean architecture
- Chapter 2 - A basic example
- Chapter 3 - Error management
- Chapter 4 - Database repositories

Some highlights:

- The book is written with beginners in mind

- It contains 3 full projects, two small ones to introduce TDD and mocks, a bigger one to describe the clean architecture approach

- Each project is explained step-by-step, and each step is linked to a tag in a companion repository on GitHub

The book is free, but if you want to contribute I will definitely appreciate the help.
My target is to encourage the discussion about software architectures, both in the Python community and outside it.

I hope you will enjoy the book! Please spread the news on your favourite social network

Launch HN: Charge Running (YC W21) – Social running app with real-time coaching

75 points|mknippen|4 years ago|60 comments

Hi HN,

I’m Matthew Knippen, a 10 year iOS dev turned CEO for Charge Running (www.chargerunning.com). We built a mobile app that allows you to run with others from all over the world in real-time, all while being trained by a certified run coach. Think of us as Peloton for running, but way more social. You can learn more about our product here: http://www.chargerunning.com

I’m a career mobile dev that always liked hacking on things, and worked on over 60 applications ranging from photos, games, and fitness. Over the years, when I come across a problem in my life, I write code to solve it. I built a garage door opener app before it was cool, an electronic Go board, and an app to track different whiskeys I’ve tasted like “Untappd for Spirits.” Charge was born out of a much bigger personal problem:

I used to run a fair bit with my friend (and now co-founder) Rory. It was a great way to stay in shape and having someone to talk to helped the time go by faster. Unfortunately, when Rory moved across the country for the military, both of us ran significantly less than we did before. I came to the conclusion running by myself… SUCKED! We were chatting about it on the phone one day knowing there had to be a better solution, and that's when we thought of Charge. I spent the weekend hacking something together, and on Monday, we tried it out.

Our first version of the app was an all white screen where it showed two things, Rory’s distance, and mine. We hopped on a phone, and used the app to have a friendly competition. A programmer vs a Navy Seal is rarely a fair challenge, and he kicked my a$$, but we LOVED it! On the backend, we utilized Firebase’s Realtime database for data, and a group phone call to manage audio. (Most of this has since been upgraded)

As any developer would want to do, we kept building on it, showing things like current pace, cadence, and more. A few friends wanted to join us, so we built support for multiple users, and listened to music while we ran. It was at this time that someone joined us that was a friend of a friend, and said “Now that I know I can run with this, I never want to run without it.”

So, we decided to turn our ugly hacked together app into an official product. When talking to our users, we found out that they wanted four things during their runs:
1. Motivation - The hardest part about going for a run is committing to do it and getting those shoes on
2. A social experience - Every other running app focuses on social after the run, not the ability to run with others in real-time.
3. Education - Most beginner runners just start running. After hearing Rory talk with them, they learned proper form and how to improve without getting injured.
4. Music - When asked what a big pain point was, users said they needed to put more work into their playlist than they spent running!

We took that information, and made a small pivot into dedicated coached classes, where a certified trainer would guide you through a specific type of workout. Each workout was effort-based, meaning whether you’re a complete beginner, or have run 25 marathons, you could join any class and fit right in.

We hired coaches (finding them by doing a bit of web scraping) and built an audio solution with professional DJ software, allowing the coaches to change the beat of the music and auto-blend them together.

Rory and our other co-founder Julie (my sister), would host a few classes a day. I quit my job as an iOS contractor for a big company to focus on a start-up full time, at the same time my wife was 8 months pregnant with our first child. (I have the world’s most supportive wife!)

We launched, and were instantly overwhelmed. Apple featured us on “New Apps We Love”. MacWorld called us the “App of the Week”. We got 25K downloads, but we were still in a very early beta. We had no on-boarding. No app store video. Calling it a website would be an exaggeration. No one knew what we did, or how it worked, and we churned 99% of the users within the first week.

However, the ones that stayed lit a fire inside of us that we never knew we had. We talked to them constantly, and they defined our product roadmap. Since then, we’ve had users run over 350,000 miles in live classes, hosted a wedding day ceremony for them, and have seen people become best friends who live halfway across the world from each other. That being said, we really want to learn more, and keep iterating. We would love to hear the community's feedback and answer any questions.

Ask HN: Demotivated Founder With Lost Passion Seeking Advice

35 points|markbao|17 years ago|26 comments

Posting this for a friend who is CEO of a startup, but wishes to remain anonymous

The past few weeks, I've been demotivated. Our product is done and we're ready to launch, but there's a bigger problem, I'm just not passionate anymore. The current cons:

a) My cofounder doesn't have it. He's not tech savvy at all, doesn't contribute the way he should, and has turned everything into a two-person bureaucracy. Just not a motivating person to work with. He also hasn't made any real sacrifices. I've left school, moved across the country, and a lot more for our startup.

b) It's been close to two years of work, and not a whole lot of momentum. We've bootstrapped the entire time, with a small amount from a customer. Bad part is, we don't have any money left for new development. I've handled a good portion of the tech issues, except for the core programming of the app: interface design, database design, sys admin/config, some bug fixing,etc. Think of it more as a CTO role, than a lead dev role. At the end of the day it's still 2 of us, that's a troubling sign. I'm a pretty motivating person, and the fact no one else has really shown interest in helping out or getting involved is scary.

c) I'm not overly familiar with how to run an enterprise/business software company. I've done consumer tech, this isn't my first venture, and I know enough (and yet have a shitload to learn). Consumer web companies seem much more straight forward to me. Business web apps are a different beast to launch.

d) This isn't a question of lack of traction, whether it's a good company, a good product, etc. Maybe this is the scary part: the product is awesome, the market we're going after is a good one, and there's a lot of opportunity. The problem lies more with the co-founder and the overall energy/excitement of the company.

So my question is:

a) Do I bail out and start something else/ join something else that is in its really early stages?

b) Is there a way to get motivated again/fix these problems?

Many thanks in advance.

Request HN: An open source list of disallowed usernames

21 points|neya|12 years ago|18 comments

Hey HN, I would like to compile a list of disallowed usernames that will be open source. I'm sure most of you use a pre-compiled list of usernames that you don't want people to sign up with for your startup. I'm trying to build a bigger, more useful, open source version of the same list.

For example, you wouldn't want someone signing up with 'support' or 'admin' as username, right?
That would mean they would be able to generate a URL like http://example.com/support or http://support.example.com.
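A minimal sketch of how such a list is typically checked at signup. Only 'support' and 'admin' come from the post; the other entries and the function name are illustrative assumptions, not contents of the linked repository.

```python
# 'admin' and 'support' are the examples given above; the remaining entries
# are hypothetical additions for illustration.
RESERVED_USERNAMES = frozenset({"admin", "support", "root", "www", "api"})


def is_allowed_username(username, reserved=RESERVED_USERNAMES):
    """Return True when the name is non-empty and not on the reserved list.

    Comparison is case-insensitive so 'Admin' cannot slip past 'admin'.
    """
    normalized = username.strip().lower()
    return bool(normalized) and normalized not in reserved


if __name__ == "__main__":
    for candidate in ("admin", "Support", "neya"):
        print(candidate, is_allowed_username(candidate))
```

Normalizing before the lookup is the important design choice: URLs and subdomains are usually case-insensitive, so the check should be too.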

I'm planning on consolidating this list into one single huge database that all startups can benefit from. Hence, as a humble request, I kindly ask you to donate your list so we can put everything in one place, together!

You can contribute via Github:
https://github.com/dsignr/Disallowed-usernames

Let's get this list up so every one of us can benefit from it. Thanks!

Neya

Ask HN: What do you like to see in a web app quickstarter?

14 points|jelmerdejong|8 years ago|3 comments

Hi there, I've noticed that when I work on a new idea, creating an MVP or prototype, I tend to spend way too much time on setting up a new project, building out things that are important, but far from urgent to get an idea validated. To save myself time, I created a basic web app starter project (call it a blueprint / bootstrap). See: https://github.com/jelmerdejong/flask-app-blueprint

Currently it sets up a basic Flask project, on PostgreSQL, and is optimised to be deployed to Heroku. Features are:
- user registration, including email validation (with Mandrill for transactional email)
- basic examples of how to create something in the database, read it and update it.
- some basic test coverage

So far I've been adding features / things I like to see myself. But more importantly, to make this useful to a bigger audience: what would you like to see added? Where do you spend time that you would rather use to create something unique?

Free Book: Clean Architectures in Python

6 points|thedigicat|6 years ago|2 comments

[I posted the original news here on HN on Christmas, so I hope this doesn't feel like I'm flooding the site with self-advertisement]

On Christmas 2018 I published on Leanpub a free book, "Clean Architectures in Python". It's a humble attempt to organise and expand some posts I published on my blog in the last years.

You can find it here: https://leanpub.com/clean-architectures-in-python

It has already been downloaded by 7,300 people and some of them were kind enough to support me with money. It already went through 7 minor revisions thanks to the help of some readers who spotted typos and errors (revision 8 is in the works). I want to say thanks to everyone who downloaded and read it. It's really good to know that some people find it helpful for their career.

The book is divided in two parts, this is a brief overview of the table of contents

* Part 1 - Tools

- Chapter 1: Introduction to TDD

- Chapter 2: On unit testing

- Chapter 3: Mocks

* Part 2 - The clean architecture

- Chapter 1: Components of a clean architecture

- Chapter 2: A basic example

- Chapter 3: Error management

- Chapter 4: Database repositories

Some highlights:

- The book is written with beginners in mind

- It contains 3 full projects, two small ones to introduce TDD and mocks, a bigger one to describe the clean architecture approach

- Each project is explained step-by-step, and each step is linked to a tag in a companion repository on GitHub

The book is free, but if you want to contribute with money I will definitely appreciate the help. My target however is to encourage the discussion about software architectures, both in the Python community and outside it.

I hope you will enjoy the book! Please spread the news on your favourite social network and thanks for downloading it.

Ask HN: How do you enforce coding best practices?

5 points|jacobevelyn|5 years ago|2 comments

Over the course of my career as a software engineer I've become a bigger and bigger proponent of using automated tooling—linters, static security scanners, tools that check database migrations for safety, etc.—in our CI system to enforce best practices and reduce risk.

But I'm wondering whether these solutions amount to a "local maximum." Is running a handful of checks in CI (and maybe in git commit hooks as well) the best we can do, or are there other approaches that you've used successfully? What do FAAMNG do?
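For readers unfamiliar with the kind of check being discussed, here is a rough sketch of a migration-safety linter that could run in CI or a commit hook. The two patterns and the function name are assumptions chosen for illustration; real tools cover many more cases and parse SQL properly rather than matching lines.

```python
import re

# Illustrative rules only: statements that destroy data when applied.
RISKY_PATTERNS = {
    "drops a table": re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE),
    "drops a column": re.compile(r"\bDROP\s+COLUMN\b", re.IGNORECASE),
}


def find_risky_statements(sql):
    """Return (line_number, reason) pairs for lines matching a risky pattern."""
    findings = []
    for lineno, line in enumerate(sql.splitlines(), start=1):
        for reason, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, reason))
    return findings


if __name__ == "__main__":
    migration = (
        "ALTER TABLE users ADD COLUMN age integer;\n"
        "ALTER TABLE users DROP COLUMN nickname;\n"
    )
    for lineno, reason in find_risky_statements(migration):
        print(f"line {lineno}: {reason}")
```

A CI job would exit with a non-zero status whenever the list of findings is non-empty, which is what turns the best practice into an enforced rule.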

Ask HN: As a junior, how do I deal with a lack of guidance from my boss?

4 points|cloud_line|5 months ago|6 comments

Background:

I started as a back-end dev apprentice a year ago. I moved up to junior six months later. My primary job is to build REST and SOAP APIs. I report directly to my boss who is the head of all three dev teams. For a while, it was just my boss and myself writing the back-end code, until a few months ago when he hired two more back-end apprentices.

Issue 1: Feeling Isolated

This is an in-office position where I spend nearly the entire work week alone. It's not unusual for two weeks to go by without me seeing my boss. Before I became a programmer, I thought I would love that. It turns out, I'm really struggling with it. I find the lack of human interaction actually leaves me feeling isolated in a negative sense. It's gotten to the point where I'm questioning if programming was the correct career transition.

Issue 2: Lack of Guidance

I feel like there's a severe lack of guidance on our team. For instance, when I interviewed for the position, we discussed that code reviews were mandatory. In the past year, I had one or two in-person code reviews, and one or two via email. All the rest of my projects have been pushed to remote, without any code review. Granted, we have separate "development" and "production" branches, and many of those projects are not yet in production. So it's possible we will review the code eventually. But given how busy my boss' schedule is, I find it unlikely.

The company's codebase lacks organization. For example, most of our APIs have controllers with sometimes several thousand lines of code. Although my experience is minimal, I've managed to teach myself a little bit about the repository design pattern and the service-layer approach. I recently convinced my boss that we should strive for better separation of concerns. He agreed, and we're now starting to move some of our database logic into a separate database access layer (DAL). Also, I've taken it upon myself to find YouTube videos for our back-end team to watch for group training. We'll be watching those videos next week and having a group discussion on how to better implement our APIs.
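To make the separation concrete, here is a small sketch of the repository and service-layer split mentioned above, using Python and an in-memory SQLite database. The class names, the `users` table, and the `(status, body)` return shape are all hypothetical; the post does not describe the team's actual stack.

```python
import sqlite3


class UserRepository:
    """Database access layer: the only code that contains SQL."""

    def __init__(self, connection):
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, name):
        cursor = self._conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        self._conn.commit()
        return cursor.lastrowid

    def get(self, user_id):
        row = self._conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


class UserService:
    """Service layer: business rules, with no SQL and no HTTP details."""

    def __init__(self, repository):
        self._repo = repository

    def register(self, name):
        cleaned = name.strip()
        if not cleaned:
            raise ValueError("name must not be empty")
        return self._repo.get(self._repo.add(cleaned))


def handle_get_user(repository, user_id):
    """Thin controller: turn a lookup into a (status, body) pair."""
    user = repository.get(user_id)
    if user is None:
        return 404, {"error": "user not found"}
    return 200, user


if __name__ == "__main__":
    repo = UserRepository(sqlite3.connect(":memory:"))
    print(UserService(repo).register("  Ada "))
    print(handle_get_user(repo, 99))
```

With this split, the controller stays a few lines long, and the repository can be swapped for a fake in tests without touching the service or the controller.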

Why is this an issue? It isn't, per se. I feel as if I'm pushing the back-end team in the right direction. It's a good feeling. I want our codebase to be better than it is, and I want our team to be in the 21st century as far as coding practices are concerned.

I think the issue here is the small bit of stress I'm feeling toward taking on a role that isn't necessarily something I'm qualified for. I think at a bigger company, the design patterns and training videos would come from someone above me, whereas I'm a junior who has mostly self-taught everything I know. My boss isn't the traditional "senior" developer. He's a former server admin who over the years learned enough about programming and SQL to build out a back-end for the company. Now, we have hundreds of APIs and a team of over 10 devs that he's managing.

I've also become a mentor to one of the apprentices on our team. At lunch, he asks me questions about SQL, git, databases, HTTP requests, and so on. I'm glad that I can help him out. Again, it feels good to have a positive impact. But at the same time, I'm mostly teaching him things that I've taught myself using the web. What if what I've taught myself is incorrect?

Lastly, and relating back to the feeling "isolated," I organized a group code review for myself and the two other apprentices recently when my boss was out of town. A large part of why I organized this relates back to my struggle with being alone throughout the majority of my workweek. This isn't something that I thought would be such a struggle, but that entire week was spent by myself at my computer until I went so stir-crazy that I decided to organize a code review with the other two back-end devs. I think it had a positive impact. I looked at our git history and saw one of the devs had committed some of the changes we discussed.

Ask HN: Learning C: Books, Tools, Community

4 points|neuland|8 years ago|0 comments

TL;DR: What are the good books, tools, and community places? I'm a Linux user interested in user-level C applications like NGINX, Redis, SQLite, and Python bindings for some C libraries.

----

I'm in the process of learning C and am hoping that you all can help in the discovery process. It took me a while to figure out that the best material for learning is in books. And, that has me wondering: What else is obvious to everyone else except me?

For a general idea of where I'm at, I am interested in working on user-level applications, especially in the networking and database area, like NGINX, Redis, and SQLite. I'm also interested in writing Python bindings for some C libraries that I use often.

I'm a Linux user and generally do everything in the terminal. So far, I've written a couple of programs (a couple of CLI programs for processing text and a simple HTTP service using Kore) and am looking to get into something a little bigger.

In terms of books/sites, I've read the following already:

- K&R

- The Standard C Library

- 21st Century C

- Expert C: Deep C Secrets

- Python 3 C API Docs

- The Linux Interface (reading now)

For tools, I'm already using:

- Vim

- GCC

- GDB

- Make

- Valgrind

- Autotools

- `man 3 {some-libc-thing}`

- `man 2 {some-syscall-thing}`

So, what am I missing? I'm especially wondering where all the community is.

- Where do people that write C gather and talk? Forums, Mailing lists, etc.

- What books/websites are best for learning C?

- What tools should I be using? Strace, something for profiling?

Ask HN: How do you organize your analytics/BI SQL code?

3 points|curl-up|2 years ago|3 comments

I'm primarily asking for the following scenario, but I'm interested in other relevant situations as well.

Imagine you are an analyst in a medium-sized company. Data analytics / Business insights team you're part of has somewhere between 5 and 20 people. Daily, you get questions from other departments for custom reports, dashboards and quick answers. Some of it is part of a bigger project that takes months and results in comprehensive dashboards, some of it is ad-hoc and will never be needed again, but you usually don't know if some simple quick question will turn into a bigger thing. However, in all of these cases, most of the actual "work" is about writing, rewriting, modifying and maintaining SQL code.

My question is: how do you manage all that code? Is there a git repo where you follow some nice structure you defined? Do you build everything in DBT? Do you use fancy tools like Metabase? Do you store it in your data warehouse itself (e.g. as a stored procedure or view)? Do you just write the code and then throw it out because, the next time you need something like this, database would have changed anyway (or a completely new one will become the main source of data), so it makes no sense to even store it?

I'm intentionally not specifying the "stack" here, as I'd like to hear what actually works well, not what is just the least-bad option for a particular tech stack.

I'm especially happy to hear from people who have had bad experiences with whatever they chose to do, as I'm in the process of trying to solve this problem for my team and would really like to avoid as many mistakes as possible.

Ask HN: How do you organize your analytics/BI SQL code?

3 points|curl-up|2 years ago|1 comments

I'm primarily asking for the following scenario, but I'm interested in other relevant situations as well.

I'm intentionally not specifying the "stack" here, as I'd like to hear what actually works well, not what is just the least-bad option for a particular tech stack.

Ask HN: Should I use a blockchain as a keyserver?

2 points|thalassophobia|3 years ago|1 comments

AFAIK, one of the bigger problems with implementing PGP on a wider scale is the need to trust a keyserver to provide the correct public keys and update them when necessary. If so, can the database be decentralized via a blockchain so that no entity has absolute control over it? Key rotation can be implemented via additional blocks containing new keys signed by the previous ones. Proof-of-work also functions as a kind of CAPTCHA to solve potential spam issues. Since the rate at which new keys are added shouldn't be high, low transaction speed doesn't matter either. Would this kind of setup work and if not, why?

Ask HN: MySQL expert anyone?

2 points|musikorama|15 years ago|1 comments

I need to get a MySQL expert to help me optimize my database.

Mainmusik.com is a high-traffic music site which is now getting slower and slower as the database gets bigger and bigger.

Anybody here want to share some experience?
Thanks!

Ask HN: Does hashing only part of a file make sense as unique checksum?

1 points|sandreas|3 years ago|8 comments

Hey HN,

currently, I'm having a performance issue with my little side project `tonehub`[1]. It's a small Web API including a background indexer task for audio files in pretty early state.

Introduction: Sometimes I move an audio file to another directory, because the metadata changed. This results in losing all of the file's non-metadata history (playback count, current position, playlists, etc.).

To overcome this, I implemented hashing via xxhash only for the audio part of the file, skipping the metadata part. If a file is indexed but its location is not found in the database, it hashes the file, looks it up and, if a unique match is present, updates only the location, keeping the history and relations.

Now to my problem: It's too slow. I have many audio book files in m4b format, most of the time bigger than 200MB, and hashing a file like this takes pretty long, long enough that indexing a whole library feels too slow in my opinion.

So I thought about the following alternatives to improve that:

  - Hashing only a fixed-length part of the file (e.g. 5MB around the midpoint position, because intros and outros are often the same)
  
  - Hashing a percentage size part (e.g. 5% of the audio data size)
  
  - Combine one of these "partial" hashes with a size check (e.g. hash=0815471 + size=8340485bytes, because hash and size collision may be less likely)?
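A rough sketch of the first and third alternatives combined: hash a fixed window around the midpoint of the audio data and append the audio data size. It uses `hashlib.blake2b` from the standard library as a stand-in for xxhash, and it assumes the caller already knows where the audio payload starts and how long it is (`data_offset` and `data_size` are hypothetical parameters that a container parser would supply).

```python
import hashlib
import os

WINDOW_BYTES = 5 * 1024 * 1024  # the 5MB window suggested in the list above


def partial_fingerprint(path, data_offset=0, data_size=None, window=WINDOW_BYTES):
    """Hash a window around the midpoint of the audio data, plus its size.

    data_offset and data_size describe the audio payload inside the file.
    By default the whole file is treated as payload.
    """
    if data_size is None:
        data_size = os.path.getsize(path) - data_offset
    midpoint = data_offset + data_size // 2
    start = max(data_offset, midpoint - window // 2)
    length = min(window, data_offset + data_size - start)
    with open(path, "rb") as handle:
        handle.seek(start)
        chunk = handle.read(length)
    # blake2b stands in for xxhash here; any fast hash fits this role.
    digest = hashlib.blake2b(chunk, digest_size=16).hexdigest()
    return f"{digest}:{data_size}"
```

This reads at most one window per file regardless of file size, so indexing cost stops growing with the length of the audio book. Whether the collision risk is acceptable depends on the library: two different files would need both the same payload size and identical bytes in the sampled window.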

It feels like that won't work. So I ask HN:

  Would one of these alternatives be enough to avoid collisions? 

  If so, what would be a "sufficient" part of the file and which alternative is the best?

Thank you

[1] https://github.com/sandreas/tonehub

Ask HN: Social app without the social part

1 points|bapbap|17 years ago|1 comments

I'm building an app that has a large social component to it but that's not the point of it. I'm not interested in building the next Facebook, or even an app with all the features of Facebook.

Is there any way I can build my app, let people have their profiles, friends, privacy settings etc all in my database but somehow integrate with MySpace, Bebo, Facebook and let people keep the majority of the social network malarky there?

I'm not entirely sure what I mean, it's much bigger than just a Facebook app and I need to store the information locally (like name, username, password, email etc) but I'd love to tap into people's networks stored elsewhere.

Any ideas? Thanks!

Show HN: Fantasy Sports with Supabase, Vercel, and Next.js (https://playfantasydraw.vercel.app/signin/password_signin)

1 points|playfantasydraw|7 months ago|0 comments

Have enjoyed playing fantasy sports for years. This year I have a league going with friends for a $500 prize pool.

It's tough to do a full team with drafts, because it takes so long to monitor injuries, set lineups, etc. So we came up with a new game mode. Just pick one quarterback per week, and if he throws for more yards than his real life counterpart, you advance. Person who advances the farthest without getting eliminated wins the $500. The catch is that each quarterback can only be picked once by each contestant throughout the whole NFL season.
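The two rules above (one pick per week, each quarterback usable once per contestant) can be sketched as a small validation step. The function names and the dictionary shape are assumptions, and "real life counterpart" is read here as whichever quarterback the pick is compared against that week; the site's actual logic is not shown in the post.

```python
def record_pick(picks_by_week, week, quarterback):
    """Return a new {week: quarterback} mapping with the pick added.

    Raises ValueError if the week already has a pick or if this contestant
    has already used the quarterback earlier in the season.
    """
    if week in picks_by_week:
        raise ValueError(f"week {week} already has a pick")
    if quarterback in picks_by_week.values():
        raise ValueError(f"{quarterback} was already picked this season")
    updated = dict(picks_by_week)
    updated[week] = quarterback
    return updated


def survives_week(picked_yards, counterpart_yards):
    """A contestant advances only if their pick threw for strictly more yards."""
    return picked_yards > counterpart_yards


if __name__ == "__main__":
    picks = record_pick({}, 1, "QB A")
    picks = record_pick(picks, 2, "QB B")
    print(picks, survives_week(280, 245))
```

Running the same check on the server side, not only in the browser, keeps a contestant from reusing a quarterback by editing the request.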

I used Supabase to do the authentication and login/signup process. Supabase is also used for the database. For example, when my friends and I log in to make our weekly picks, those picks are stored in Supabase. The rest of the project uses Vercel and Next.js. I have read stuff online about people who wake up to unexpectedly high Vercel bills, so I'm hoping that doesn't happen to me (tips on how to avoid this are appreciated).

Right now I'm still coding and getting the site ready for a bigger launch with more contestants than just my friends. Open to feedback, suggestions, collaborations, etc.

Tell HN: Heroku deleted my database with no warning

460 points|fireworks|2 years ago|206 comments

Last December, Heroku nuked the database on one of my active projects. I was travelling at the end of the year and did not catch wind of this until I returned and saw messages about an issue with the app. Sure enough, I checked and noticed that the database was gone and detached on December 9th.

Before the hate comes out, yes I know Heroku deprecated free tiers. However, I did not understand this would affect my projects on paid dynos. The real issue here is that I never received a single email or notice of any kind to my email about this. From researching, it appears most people received SEVERAL notices about this. I did not think there was an issue with my setup because I received zero communication.

Upon reaching out, Heroku has told me that they cannot recover the database. They also admitted that there was "an issue" sending out notifications to me, and confirmed that none were sent.

So I guess just a warning to all - your database might be nuked at any time. I learned my lesson about not doing an offsite backup regularly. I guess the bigger lesson though is that Heroku should really be a last resort option for projects these days. RIP.

Help, what should I do when 2 founders have had a falling out?

21 points|gogy|15 years ago|17 comments

--- Introduction ---

More than three years back, three of us (all classmates from college) decided to take the plunge and start up. I know one of the founders very well; we did a lot of projects together in college. I was also working with him (in the same company) before we decided to start up. We both worked at a small software development firm. (Let’s call him Jerry) The other co-founder was a very good friend from college and had similar ideas about starting up. He was working for a fairly large software development firm before starting up with us. (Let’s call him Tom)

We decided to put in our own money and initially started out creating our own product, a web based event planning tool. Our initial plan was to create a basic version of the application and depending on the response create a premium (paid) version for event organizers. Unfortunately, at that time, event organizers in India weren't really interested in the idea. We started looking for consulting work and received a few local and overseas projects. This definitely boosted our confidence and our bank balance.

[While working together we noticed that the way Jerry and I worked was different from the way Tom was used to working. Tom wanted a proper document in place which would describe exactly what needed to be done, while Jerry and I had no qualms about starting a project without everything in writing. As projects progressed, we would realize that we had missed something and that it would require some more effort. This is where we saw things differently: Tom would sometimes flatly refuse to work on the change, or be upset because he now had to work on something that wasn't mentioned earlier on. Jerry and I would not think much of it and would go ahead and get it done. This caused issues between Jerry and Tom, which eventually led to a discussion in which we tried to make Tom understand that requirements change, that we aren't a big software company, and that following the processes prevalent in bigger software companies would not be feasible for us.]

Over the course of one year we completed a couple more projects. We also managed to sign on a large project which would involve us hiring 3 more developers. The contracts were signed, we had finished hiring, and we had started the initial requirements gathering phase when suddenly the client wanted to put the whole project on hold for a couple of months (this was during the slowdown around Q1 '09). The client eventually backed out completely. We were in a very precarious situation as we had hired people and had no projects that were currently running (never put all your eggs in one basket). As this was a recession, getting projects was very hard, and we eventually had to dip into our savings to keep our startup afloat.

After a few very uncertain months we slowly started receiving work. We found a project which would involve us dedicating one person to it continuously (let’s call it the US project). Tom started working as a dedicated resource for this project since he had experience in the technology they were looking for. Simultaneously we started getting a couple of small projects which would keep everyone mostly busy.

At this time the project that Tom was working on made up a significant portion of our revenue and is what kept us afloat till things picked up.

Tom (after about 6 months) started training one of our developers (let’s call him Patrick) to work as the dedicated resource for the project. For quite some time Tom & Patrick would both be working (the knowledge transfer period was quite long) on the US project. Tom still spends at least a couple of hours every day helping Patrick.

Jerry and I have been working on multiple projects as we have been getting a steady stream of projects and have more than we can currently take up. Most of us have been working at least 12-14 hours a day.

In June this year Jerry and Tom had a heated argument on the direction the company should take. We all had a long discussion on our current situation and future projections and what would be our roles moving forward. We eventually came to the conclusion that Jerry & Tom would start working on sales and marketing while slowly reducing the time they spent writing code, and I would continue handling the development teams. Jerry was already handling most of the client interaction while Tom was mostly handling client interaction for the US project and one other project. We also decided we would hire more people.

--- Current Situation ---

Currently we are a small (8 people) moderately profitable software company.

Tom spends a couple of hours a day on the US project and the rest of his time on another project. He also helps us when we need database advice (he's the database expert).

Tom had taken up the task to update our website (we haven't updated our website for over a year). He made a list of pages that would need to be updated. He had informed us of the same and had asked us to pick up the pages we wanted to write content for; he would write the content for the rest. As we were really busy completing projects we did not pick any topics and Tom didn't mention the website to us again. Tom was also helping Jerry out on a project and Jerry had already asked him to complete a set of tasks that the client required.

Jerry, Nick and I are working on about 5 projects; depending on the deadlines and requirements we may work on 3 projects in a day. Jerry is also spending some time getting our newly hired developers up to speed on the way we work.

A few days back our client during one of our weekly calls mentions that he would like a feature implemented in a different way. Jerry agrees as the newer implementation was more elegant.

Initially Jerry was supposed to complete this phase of the project, but after the requirement change Tom would be the ideal person to implement it (some of the processing that would happen in the application could be done directly in the database).

After the call Jerry sends an email telling Tom of the requirement change and asking him how much time it would take him to implement the change given his other commitments.

To which Tom responds he would not be able to make the changes alone and would require Nick to implement the changes.

Jerry responds saying that he and Nick are working on 3 projects together and it would take some time till Nick gets free, and also asks him why he would not be able to implement the changes alone.

Tom responds with an email telling Jerry that he's being unprofessional. Earlier Jerry had mentioned that he would be handling this phase of the project, and now he wants Tom to handle it. Tom accuses Jerry of trying to push tasks down his throat. He also mentions that the last time he worked on implementing something similar was 2 years ago and would require at least 5 days in total to understand their schema and implement the changes alone. He also mentions that other projects were taking up time and he hadn't spent any time writing content for the website and as none of us had volunteered to help, he would have to write all the content on his own. In the end he mentions it would not be possible for him to implement the change and tells Jerry to ask someone else for help.

After which Jerry responds saying that requirements change and he should help out, ending with "we all need to have a talk".

Now during the discussion we have
- Tom mentions that we both have not volunteered to write content for the website.
- I respond by saying I am not very good at writing and currently I am swamped with a lot of work. I apologize for not replying to his email about the website content, and also not communicating to him why I cannot take up anything now.
- Jerry responds by saying that he's swamped too, he's working on three different projects simultaneously, and says he’s been working around 12-14hrs and on weekends for the past month.
- Jerry accuses Tom of not putting in as much effort as the rest of us are putting in.
- Tom responds by asking if we consider the website to be important (as it has not been updated for over a year).
- Jerry responds by saying that current projects take a priority, and suggests we could consider updating the website when work is less hectic (once we have transitioned the work to the new developers we hired).
- Tom responds by saying that he envisioned all us founders to be done with writing code by this point, and expected us to all be managing the developers or the sales/marketing side of things.
- I try to explain to Tom that we are still very small and cannot afford really experienced developers. I can't envision running our startup successfully without all of us coding for at least part of our time every day for at least a couple more months. I explain that we need to focus more on the marketing and sales side of things but we can’t just jump into doing it full-time right now.
- Tom says that we should use the developers we just hired to complete the projects.
- I explain that they are fresh out of college and need to be trained before we can trust them to handle a project and deliver it successfully.
- Tom disagrees. He says that Jerry is trying to push tasks down his throat.
- Jerry says he can’t work with Tom any longer, and would like to sell his stake.

As of now we have decided to meet on Monday so that everyone has time to think on how we want to take this forward.

The way I see it, either Jerry or Tom will be leaving, and I don’t see the startup surviving at this stage without any one of the founders.

Has anyone been in a situation similar to this? I would love some advice.

Which tool should I use to build a simple dynamic website in 2020?

14 points|livingpunchbag|5 years ago|15 comments

Experienced C programmer here. I need to build a very simple website for personal usage with:

- Password-protected login to hide every single page behind

- A single sqlite database with possibly a single table (maybe a second single-line table to store my login + pw-hash + salt)

- A single page to upload stuff to the database. Some kind of asynchronous JS instead of HTML forms may be preferred.

- A single page that always shows the same database query

- Very infrequent access to the site (not more than once per day)

- There's basically zero chance that this would grow to something bigger. Maybe just extend the site functionality to allow my wife to also use it, and maybe find a way to interact with it from command line on Linux (but probably by doing HTTPS requests).
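For illustration only, the storage side of the requirements above (one data table plus a single-row login table holding login, pw-hash and salt) could be sketched with nothing but Python's standard library. The table names, function names and the choice of PBKDF2 are assumptions made for this sketch, not part of the original plan, and the web layer is left out entirely.

```python
import hashlib
import hmac
import os
import sqlite3

# Assumed work factor for this sketch; tune it for the actual host.
PBKDF2_ROUNDS = 200_000


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ROUNDS
    )


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) login table and the single data table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS login ("
        "name TEXT PRIMARY KEY, pw_hash BLOB NOT NULL, salt BLOB NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )


def set_login(conn: sqlite3.Connection, name: str, password: str) -> None:
    """Store a fresh random salt and the derived hash, never the password."""
    salt = os.urandom(16)
    conn.execute(
        "INSERT OR REPLACE INTO login (name, pw_hash, salt) VALUES (?, ?, ?)",
        (name, hash_password(password, salt), salt),
    )


def check_login(conn: sqlite3.Connection, name: str, password: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    row = conn.execute(
        "SELECT pw_hash, salt FROM login WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return False
    pw_hash, salt = row
    return hmac.compare_digest(pw_hash, hash_password(password, salt))
```

Whatever serves the pages (CGI, a micro-framework, or something else) would call `check_login` before showing anything; that part depends on the hosting setup and is not shown here.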

I did something like this with PHP back in 2007, but I know PHP is not very recommended. I worry that trying to learn a whole framework like Ruby on Rails or Django may be overkill. I really like Ruby as a programming language (although I never used RoR), so sticking with it would be cool, but Dreamhost says I should avoid it since it uses too much memory and I'm on a shared plan. I also have some experience with cgi-bin in Python, but I'm also told cgi-bin should be avoided just like PHP. I'm also aware that I could use some sort of server-side JavaScript, but there are a trillion frameworks to choose from.

What would you use today? What would you recommend?

Ask HN: What database should I use for analytics data?

11 points|AdriaanvRossum|5 years ago|18 comments

Let me rephrase that. I'm running Simple Analytics where we collect page views, events and specific details about a page view (url, time, referrer, utm_source, screen width ...).

At the moment we use PostgreSQL, which serves us pretty well. We have some caching tables set up, and with our processing scripts we have the data in customers' dashboards within 2 minutes. We move our raw visits into tables with visits aggregated per hour and per day.

This is all working fine. But we want to start targeting bigger clients now and would love for it to keep working fine at that scale as well.

A few things that are important to us when selecting a database:

- it is very popular (we want to solve issues fast)

- it is free to use (the license)

- it is easy to maintain (very important)

- it can run on my own servers

One that really appeals to us is Apache Cassandra with Apache Spark.

But we're not sure if we should optimize our PostgreSQL workflow. I wouldn't mind investing 1 or 2 weeks of my time into setting up a new database, but it should also save time in the future. The common queries we will have are (see our demo dashboard to see what we show [1]):

- get (unique) page views from a website grouped per 24 hours

- get list of most visited pages and show those with the percentages of referrals aggregated by day (/contact: twitter.com 20%, organic 80%)

- get conversions between events of sessions/list of events
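As a rough sketch of the first query in the list above, the daily grouping can be written as one aggregate query. SQLite is used here only so the example runs on its own; the table layout and the column names (`hostname`, `path`, `session_id`, `viewed_at`) are assumptions and not the real Simple Analytics schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw page view table; one row per page view.
conn.execute(
    "CREATE TABLE page_views ("
    "hostname TEXT, path TEXT, session_id TEXT, viewed_at TEXT)"
)
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?, ?)",
    [
        ("example.com", "/", "a", "2020-01-01 09:00:00"),
        ("example.com", "/", "a", "2020-01-01 10:00:00"),
        ("example.com", "/contact", "b", "2020-01-01 11:00:00"),
        ("example.com", "/", "b", "2020-01-02 08:30:00"),
    ],
)


def daily_views(conn: sqlite3.Connection, hostname: str):
    """Return (day, page views, unique page views) for one website."""
    return conn.execute(
        "SELECT date(viewed_at), COUNT(*), COUNT(DISTINCT session_id) "
        "FROM page_views WHERE hostname = ? "
        "GROUP BY date(viewed_at) ORDER BY date(viewed_at)",
        (hostname,),
    ).fetchall()


for day, views, unique_views in daily_views(conn, "example.com"):
    print(day, views, unique_views)
```

The same shape of query can feed a pre-aggregated per-day table, which is the approach the caching tables mentioned earlier already take.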

There are probably many tools out there to help, but I wouldn't use a sledgehammer to hammer a nail.

[1] https://simpleanalytics.com/simpleanalytics.com

Ask HN: How to create a small unique fingerprint of a 25h+ audio book?

5 points|sandreas|6 years ago|5 comments

I'm the author of m4b-tool (https://github.com/sandreas/m4b-tool) and I noticed that there are not many sources for audio book chapters (especially non-English ones) to tag my audio books correctly.

So I decided to create my own database for chapters, but at the moment I am struggling with how to identify audio files correctly.

With the ffmpeg chromaprint extension it is possible to create fingerprints of audio files:

  ffmpeg -i "input.mp3" -f chromaprint fingerprint.txt

This works nicely, but the longer the audio file, the bigger the fingerprint (which is reasonable). Since I would like to store the fingerprint in a database, the smaller it would be, the better.

But for a 25h+ audio book the process takes extremely long and produces a ~5MB (!) fingerprint file.

I can think of three ways to solve this problem:

  - Only take the first X Minutes of the audio file (fast, relatively small fingerprint storage, but inaccurate)

  - Hash the full fingerprint with e.g. sha512 (small fingerprint storage, but slow and is this accurate?)

  - Hash the X Minutes fingerprint (fastest, small fingerprint storage but most inaccurate)

Which would be the best way?
Are there other ways I did not think of?
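To make the second option in the list above concrete, here is a minimal sketch of hashing a fingerprint with sha512 using Python's hashlib. The function names are made up for this example, and the byte strings in the demo are placeholders, not real chromaprint output.

```python
import hashlib
import io


def digest_stream(stream, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so a large fingerprint never
    has to be held in memory at once."""
    h = hashlib.sha512()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def digest_file(path: str) -> str:
    """Hash a fingerprint file such as the fingerprint.txt written by ffmpeg."""
    with open(path, "rb") as f:
        return digest_stream(f)


# Placeholder bytes standing in for two fingerprints that differ in one byte.
first = digest_stream(io.BytesIO(b"placeholder-fingerprint-A"))
again = digest_stream(io.BytesIO(b"placeholder-fingerprint-A"))
other = digest_stream(io.BytesIO(b"placeholder-fingerprint-B"))

print(first == again)  # identical input gives an identical digest
print(first == other)  # a single differing byte gives a different digest
```

The digest is always 128 hex characters, so storage is small, but it only matches byte-identical fingerprints. If a different encode of the same book yields even a slightly different fingerprint, the digests will not match, so this works as an exact key and not as a similarity measure.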

Thank you

Ask HN: Personal, local, (portable?) Knowledge Database

2 points|cyxxon|9 years ago|2 comments

Is there an easy to use personal knowledge database you can recommend? I am trying to bring order to the chaos of ten years of putting reusable code snippets, infos on a specific weird API call, problems I might come across at a different customer (I am a consultant), etc. into text files, Word docs, PDF files, what have you. I tried to follow some sort of file system hierarchy, but tagging and categorizing would be better - do I put it under some code keyword, or under the topic? What if it fits more than one topic?

I tried TiddlyWiki and put it into Dropbox, but soon realised I cannot really add arbitrary attachments, which makes sense given the nature of the Wiki. The idea of just having one syncing file is still nice, though. Maybe something bigger, but following the same principles, like one file (maybe an archive or so), or a directory?

I know I could just put up some other Wiki software on a host, or maybe some Atlassian knockoff, that's not the problem, but then in real life I still don't always have good network access everywhere (for some obscure reason many companies still restrict network access, have shitty proxy configurations, etc.). At the other end some local software that only works on a specific OS and the DB only on one laptop is also not convenient - and I have to overcome the urge to just add this new tidbit to the old file dump and get around to using the new way soon(TM)...

Tell HN: My startup is making money and I don't know what to do

211 points|sthielen|11 years ago|128 comments

I started University Niche[0] with two friends to help college students find places to live off-campus around their universities.

If you've been through the college system in the US, chances are you know how difficult it is to find a house to rent. Many landlords will simply hang up the phone the second you tell them that you're a college student. University Niche is a database of rental properties that are open to renting to college students, with information curated specifically toward helping students find the best places to live.

We launched April 2014, and are now at three universities. We've seen a lot of success (~40% w/w growth, 100%+ m/m growth).

For many students, this is their first time living on their own, so we wanted to sell ad space to small businesses that would help students with their move (storage, furniture, etc.) and then to small businesses that would help them once they're situated (grocery stores, gyms, etc.). I coded up a basic self-serve advertising platform, and my cofounders and I went and found a local moving company. Walked in the office, talked with the owner for about 15 minutes. Sale.

Went to a mattress store. Sale.

Small restaurant, same thing.

We've walked into six random businesses, and only one said no.

So we're pretty excited it works, but we have bigger plans than this. We have projections and budgets to expand to 150 schools nationwide.

We know how we're gonna do it and we know how much it will cost, but we are all at a loss for how to take that next step.

We have a validated product and now a validated revenue model. We've got traction. But we don't have the funds to take this thing to the next level, and we don't know how to meet the people who can help.

Anyone have any advice or been in a similar situation?

[0]http://universityniche.com

Launch HN: Synth (YC S20) – Realistic, synthetic test data for your app

121 points|openquery|5 years ago|48 comments

Hey!

Christos, Damien and Nodar here and we're the co-founders of Synth (https://getsynth.com) - Synth is an API which allows you to quickly and easily provision test databases with realistic data with which to test your application.

We started our company about a year ago, after working at a quantitative hedge fund in London where we built models to trade US equities. Strangely, instead of spending time developing models or building the trading system, a large portion of our time was spent on just sourcing and on-boarding datasets to train and feed our models. The process of testing datasets and on-boarding them was archaic; one data provider served us XML files over FTP which we then had to spend weeks transforming for our models to ingest. A different provider asked us to spin up our own database and then sent us a binary which was used to load the data. We had to whitelist their API IP address and set up a cron job to make sure the dataset was never out of date. The binary required interactive input so it couldn't be scripted, or rather it could be, but you need something to mock the interactive params. All this took a junior developer on the team a good 3-4 days to figure out and set up. Furthermore, after our trial expired we decided we didn't actually need this dataset, so those 3-4 days were essentially wasted. Our frustration around the status quo in data distribution is what drove us to start our company.

We spent the first 6 months building a privacy-aware query engine (think Presto but with built in privacy primitives), but software developers we talked to would frequently divert the topic to the lack of high quality, sanitised testing data during the software development lifecycle. It was strange - most of us developers and data scientists constantly use some sort of testing data for different reasons. Maybe you want a local development environment which is representative of production but clean from customer data. Or a staging environment which contains a much smaller, representative database so that tests run faster. You could want the dataset to be much bigger to test how your application scales. Maybe you want to share your database with 3rd party contractors who you don't necessarily trust. Whichever way you put it, it's strange that for a problem most of us face every day, we have no idiomatic solution. We write bespoke scripts and pipelines which often break. They are time consuming to write and maintain and every time your schema changes you need to update them manually. Or we get lazy and copy/paste production.

We finally listened to all this feedback, dropped the previous product, and built Synth instead. Synth is a platform for provisioning databases with completely synthetic data.

The way Synth works can be broken into 3 main steps. You first download our CLI tool (a bunch of python wrapped up in a container) and point it at your database to create a model (we host the models on the Synth platform). This model encodes your schema, and foreign key relationships as well as a semantic representation of your types. We currently use simple regular expressions to classify the semantic types (for example an address or license plate). The whole model is represented as a JSON object - if the classifier gets something wrong you can easily change the semantic type. Once the model has been created, the next step is to train the model. Under the hood we use a combination of copulas and deep-learning models to model the distributions and correlations in your dataset (the intuition here is that it's much more useful for developers to have realistic data than just sample from a random number generator). The final step is to use the trained model to generate synthetic data. You can either sample directly from the model or we can spin up a database for you and fill it with as much data as you need. The generation step samples from the trained model to create realistic data, as well as utilising bespoke generators for sensitive fields (credit card numbers, names, addresses etc.)
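To give a feel for the classification step described above, here is a toy sketch of regex-based semantic typing that produces a JSON model you can edit by hand. It is not Synth's actual code or model format; the patterns, the field names and the `build_model` helper are all made up for illustration.

```python
import json
import re

# Hypothetical patterns; a real classifier would need many more, and stricter ones.
SEMANTIC_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "credit_card": re.compile(r"(?:\d{4}[ -]?){3}\d{4}"),
    "ipv4": re.compile(r"(?:\d{1,3}\.){3}\d{1,3}"),
}


def classify_column(values):
    """Return the first semantic type whose pattern fully matches every
    non-empty sample value, or "unknown" if none does."""
    samples = [v for v in values if v]
    if not samples:
        return "unknown"
    for name, pattern in SEMANTIC_PATTERNS.items():
        if all(pattern.fullmatch(v) for v in samples):
            return name
    return "unknown"


def build_model(table, columns):
    """Build a plain dict (serialisable as JSON) describing one table."""
    return {
        "table": table,
        "columns": {
            col: {"semantic_type": classify_column(vals)}
            for col, vals in columns.items()
        },
    }


model = build_model(
    "users",
    {
        "contact": ["a@example.com", "b@example.org"],
        "card": ["4111 1111 1111 1111"],
        "nickname": ["ann", "bob"],
    },
)

# The classifier could not type "nickname", so override it by hand.
model["columns"]["nickname"]["semantic_type"] = "name"

print(json.dumps(model, indent=2))
```

The manual override at the end mirrors the point about editing the JSON when the classifier gets something wrong.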

You can run the entire lifecycle in a single command - you point the CLI tool at your database (currently Postgres, MySQL and MsSQL) and in ~1 minute you get an IP address and credentials to your new database with completely synthetic data.

We're long time fans of HN and are eagerly looking forward to feedback from the community (especially criticism). We've made a free version available for this week so you can try it with no strings attached. We hope some of you will find Synth useful. If you have any questions we'll be around throughout the day. Also feel free to get in touch via the site.

Thanks!
~ Christos, Damien & Nodar