Frequently Asked Questions

AI Interaction

Very much so. If you don't choose an AI model to interact with, we choose one for you. In Q&A mode, we rotate the AI models until you choose one or more. In multi-turn interactions like Chat or Editorial, we choose one for you and remember it until you select another AI model. Once you select an AI model for a session, it remains unchanged until you switch to another one. Do check out our AI model selection from the 'AI Models' menu in the left pane of the interaction page.

The top left dropdown button offers various modes of AI interaction. You choose the right mode depending on the task you want to accomplish with the help of AI. This is a growing selection, and we will continue to offer more interaction modes with the advancement of AI. Here are the modes you can select -

  • Question & Answer (QA): The simplest and fastest mode to get an answer to your question from AI. In AI jargon, we call it a zero-shot response, i.e. the AI will answer your question from its own learned knowledge, without any supplementary information or hints. You can pass hints and additional information in your question, but that's all. You may want to tweak your question to get a better answer from the AI, but do note that every question is independent: the AI will neither memorize your previous question nor try to correct its previous answer. In Q&A mode, you can select multiple AI models to obtain answers from all of them and choose the best one. If you choose multiple AI models, you are encouraged to rate their answers.
  • AI Debate (DB): It's fun to watch AI models debate among themselves on a topic you request. In debate mode, you just need to type in the discussion topic for the AI. We randomly choose different versions from different AI families and start a debate session. Just sit back and read what aspects they bring in, where they agree and disagree, and how they end the debate. At the end, you need to read and rate their answers to reveal the identity of each participant. Debate sessions are a good way to compare the expertise of industry-leading AI models, and they also help you choose the right AI model for your specialized tasks.
  • Chat (CH): Chat is the most common mode of interaction with AI, as it brings out the true power of AI. In this mode, you start an interaction with AI and continue the conversation until you achieve your goal. Just like regular conversations, your conversation with AI is assisted by our state-of-the-art AI memory. The memory makes AI more powerful and personalized for you; check out more on memory here. Another powerful feature of chat mode is that you can switch AI models in the middle of a conversation without losing context, thanks to the decoupled memory which you carry along with you.
  • Editorial (ED): The Editorial mode is good for tasks related to a document. In this mode, you start by uploading a document and then begin an editorial interaction with AI in that context. You can do many things, like asking the AI to summarize the document, answer specific questions, expand certain parts of the document, identify grammatical errors, and so on. Like Chat mode, Editorial mode is also powered by our state-of-the-art AI memory, so it offers all the goodness of Chat mode. Note that only text and PDF documents are supported at this moment.
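
In Q&A mode, selecting multiple models amounts to fanning the same question out to each of them and comparing the answers. A minimal sketch of that pattern (the model callables below are hypothetical stand-ins for illustration, not the platform's actual API):

```python
def fan_out_qa(question, models):
    """Send the same question to several AI models (Q&A mode) and
    collect their zero-shot answers for side-by-side comparison.

    `models` maps a model name to a callable that answers a question.
    """
    return {name: ask(question) for name, ask in models.items()}

# Hypothetical stand-in "models", for illustration only.
answers = fan_out_qa("What is a token?", {
    "model-a": lambda q: "Answer from model A",
    "model-b": lambda q: "Answer from model B",
})
```

You would then read the answers side by side, pick the best one, and rate each model.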

You can think of the creativity scale as the degree of creativity or diversity you would expect in an AI response. This is also known as the Temperature setting of an AI model in the AI community. The temperature controls the consistency and creativity of an AI model's response. At a lower temperature, the model returns consistent answers, which means the same question will get the same answer most of the time. At a higher temperature, the randomness increases, hence the model offers different answers to the same question, demonstrating its creativity. You should set a lower temperature for a definitive answer (a math question, for example), whereas you may set a higher temperature while asking an AI model to write a poem. We set the default temperature value to 0.1; change it according to your expectations for the AI interaction session. We save your preferences (including the temperature setting) per session.
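
Under the hood, temperature rescales the model's raw token scores before sampling. This toy softmax illustrates why a low setting yields consistent answers and a high one yields diverse answers (an illustrative sketch, not the platform's actual sampling code):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into sampling probabilities.

    Low temperature sharpens the distribution (consistent answers);
    high temperature flattens it (diverse, creative answers).
    """
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.1)   # top token dominates
high = softmax_with_temperature(logits, 2.0)  # closer to uniform
```

At temperature 0.1 the highest-scoring token gets nearly all the probability mass, so the model almost always picks the same next token; at 2.0 the alternatives become much more likely.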

We impose a limit of 4096 tokens on AI model output, which is roughly 3072 words. Because of that limit, occasionally the model can't finish its answer within those many tokens, and the answer is truncated. AI models can offer lengthier answers, but we believe most questions can be answered within the limit. If you are in a multi-turn interaction with AI, like chat or editorial mode, you can always ask the AI to finish its previous answer.

There are a few reasons why you may receive errors from the service during your interaction with AI models. One of the well-known reasons is the request limit imposed by the AI service provider. For example, if our OpenAI account only supports 5 requests per second, the 6th request will be rejected (throttled) and shows up as an error. When many users are using our service simultaneously, such errors are inevitable. Occasionally an AI model sends no answer (empty tokens) due to restrictions put in place by the AI vendors; that also shows up as an error on our platform. Lastly, our service may get overloaded and reject some requests at times. If you witness consistent errors, please reach out to us via the feedback form or email.
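
When a request is throttled, a client-side retry with exponential backoff usually recovers without user intervention. A minimal sketch of that pattern (`make_request` is a hypothetical placeholder for whatever call hits the provider's limit, not part of our API):

```python
import random
import time

def call_with_retry(make_request, max_attempts=4, base_delay=0.5):
    """Retry a throttled request with exponential backoff plus jitter.

    `make_request` is a zero-argument callable that raises an
    exception when the provider rejects (throttles) the request.
    """
    for attempt in range(max_attempts):
        try:
            return make_request()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # waits 0.5s, 1s, 2s, ... plus a little randomness
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

The jitter spreads out retries from many simultaneous clients so they don't all hit the provider's limit again at the same instant.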

In the context of LLMs, a token is a chunk of text that the model reads or generates. A token is typically not a word; it could be a smaller unit, like a character or part of a word, or a larger one, like a whole phrase. A helpful rule of thumb is that one token generally corresponds to ~4 characters of common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words). In addition, we add additional text to your input before sending it to the AI model; this is typically referred to as the prompt. A prompt generally consists of a system instruction, your input, and the past context of your interactions, but is not limited to those.
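
The rule of thumb above can be turned into a quick back-of-the-envelope estimator (a heuristic only; real tokenizers vary by model):

```python
def estimate_tokens(text):
    """Rough token count: ~4 characters per token for English text."""
    return max(1, round(len(text) / 4))

def tokens_to_words(tokens):
    """Rough word count: ~3/4 of a word per token (100 tokens ~= 75 words)."""
    return round(tokens * 0.75)
```

For example, this heuristic maps our 4096-token output limit to roughly 3072 words, matching the figure quoted earlier.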

We want to offer you AI models from the industry-leading AI model vendors. If you know of an AI model that is state-of-the-art and competes with the existing AI models on our platform, please write back to us via feedback or email. Given enough requests, we may consider integrating such AI models. We are not restricted to the current models or existing variants; we will evolve with the industry.

AI Memory

The Chat and Editorial interaction modes are multi-turn interactions between you and AI models, which means you continue the conversation with AI until you accomplish what you want from it. Hence conversational (aka short-term) memory is a critical requirement for such multi-turn conversations. We encourage you to turn on contextual and cognizant memory in order to have more personalized and meaningful conversations with AI.

Auto-transfer is a powerful feature of your working memory; it resembles how we perceive and memorize events and interactions in our brain. The conversational (aka short-term) memory gets assimilated and moved to your contextual (aka mid-term) memory every few hours automatically. Similarly, the contextual memory gets transferred automatically to cognizant (aka long-term) memory every few days. The auto-transfer interval for conversational memory is set to 48 hours; the same for cognizant memory is set to 15 days. You can change the default settings on the Memory Management page. Just like human memory, you can't turn off memory formation; auto-transfer always runs in the background.
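
As a simplified sketch, a record's age determines which tier it would currently sit in. The thresholds below use the default intervals; the platform's real assimilation is more involved than a pure age cutoff:

```python
CONVERSATIONAL_HOURS = 48   # short-term -> contextual after 48 hours
CONTEXTUAL_DAYS = 15        # contextual -> cognizant after 15 more days

def memory_tier(age_hours):
    """Which memory tier a record of the given age would occupy."""
    if age_hours < CONVERSATIONAL_HOURS:
        return "conversational"
    if age_hours < CONVERSATIONAL_HOURS + CONTEXTUAL_DAYS * 24:
        return "contextual"
    return "cognizant"
```

Changing the intervals on the Memory Management page effectively moves these two thresholds.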

Memory is finite: when you learn new information, facts, skills or knowledge, you unlearn some past events to make way for new ones. The same is true for your cognizant aka long-term memory. Hence unlearning is a vital part of learning new information, skills or knowledge. We recommend leaving the Forget Past option checked. By default, you can store no more than 50000 records in cognizant memory, but you can change the limit.
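
With Forget Past enabled, the behavior resembles a bounded queue: once the record limit is reached, the oldest record is evicted to make room for the newest. A tiny illustration with a limit of 3 instead of the default 50000:

```python
from collections import deque

# A bounded store: appending beyond maxlen silently drops the oldest item.
cognizant_memory = deque(maxlen=3)
for fact in ["fact-1", "fact-2", "fact-3", "fact-4"]:
    cognizant_memory.append(fact)
# "fact-1" has been forgotten to make way for "fact-4".
```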

No, you can't erase your AI memory. Just as your brain works with the external world, your AI memory is a kind of simulation of your brain. Every minute detail of your interaction with AI uses AI memory and forms some memorization, however insignificant it may be, the exceptions being the Q&A and Debate modes of interaction, where you are not really engaged in an AI conversation. However, you have some control over your memory. For example, you can adjust the size of your memories, and you can make the transfer of interactions/knowledge from short-term memory to long-term memory faster. Please check out the AI Memory section for more controls and information.

Credit & Quota

We reserve the right to deny invite quota increases or onboarding invited guests into our community, mainly due to the limited resources we operate with. Having said that, we would like to grow our community with AI enthusiasts and curious minds. Do reach out via feedback or email; we will review your request and let you know our decision.

You accrue referral addon credits as soon as someone you invited joins our community. You receive an addon credit of $20.0 as a referral bonus, while the invited guest receives an addon credit of $20.0 as a new member of the community. Go ahead and invite a few AI enthusiasts you feel would make the community stronger. Please note that you cannot accrue more than $100 as addon credit to your account; however, we extend the validity period of your existing addon credit in such a scenario.

Currently the service is absolutely free of cost and will remain so until the end of 2025. Hence the dollar values associated with usage cost (including the itemized cost along different dimensions) are for informational purposes only. The amounts give you an idea of the value of the services we offer to our community members. The dollar values on the Credit & Quota page put a limit on your usage of our AI services until the end of 2025. We will introduce a pay-per-usage or subscription model for our service beginning in 2026.

You need credit (dollar value) to avail yourself of various services on our platform, including interacting with AI models and maintaining multi-layer memory for multi-turn interactions. We introduced two types of credits for that purpose. Monthly credit is a fixed dollar credit which starts on the first date of every calendar month and expires on the last date of that month. It is a monthly subscription model for your predictable usage of the LLMProxy AI service. Note that unused credit for a month does not roll over to the next month; the unused amount is forfeited. You can request monthly credit if you plan to use the service on a regular basis.

On the contrary, addon credit is applied on demand and is meant for unpredictable and additional usage. Addon credit comes with a validity period (typically in days or months depending on the dollar value) after which the remaining unused credit expires. You can request addon credit anytime; the dollar amount will be added on top of any valid addon credit available to you, and the entire amount will get the extended validity of the newly added credit.
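
The stacking rule can be sketched as follows: any unexpired balance is carried forward, and the combined amount adopts the new credit's validity period (function and field names here are illustrative, not the platform's API):

```python
from datetime import date, timedelta

def merge_addon_credit(balance, expiry, new_amount, validity_days, today):
    """Stack a new addon credit on top of any unexpired balance.

    The entire combined amount takes the validity period of the
    newly added credit, extending the older balance's life.
    """
    carried = balance if expiry >= today else 0.0  # expired credit is lost
    return carried + new_amount, today + timedelta(days=validity_days)

today = date(2025, 6, 1)
amount, expiry = merge_addon_credit(12.5, date(2025, 6, 10), 20.0, 30, today)
# amount == 32.5; the old $12.5 now shares the new 30-day validity
```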

Yes, you can. In fact, we encourage you to maintain both monthly credit (for fixed usage) and addon credit (for variable usage). On the utilization side, addon credit is considered before monthly credit, i.e. any unexpired addon credit will be utilized before monthly credit. Since the service will remain free for all users until the end of 2025, you cannot purchase addon or monthly credit. You can only request additional addon or monthly credits via feedback or email.
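
The utilization order amounts to draining the addon balance first, then the monthly balance (a minimal sketch; names are illustrative):

```python
def deduct_cost(cost, addon_balance, monthly_balance):
    """Apply a usage cost, spending addon credit before monthly credit.

    Returns the remaining (addon, monthly) balances.
    """
    from_addon = min(cost, addon_balance)
    from_monthly = min(cost - from_addon, monthly_balance)
    return addon_balance - from_addon, monthly_balance - from_monthly
```

Spending addon credit first makes sense because it expires on its own schedule, while monthly credit resets every calendar month anyway.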

You can track your credit utilization in the Credit & Quota section of your account. You can only see the current utilization of any valid credit applied to your account; expired and past credits will not appear on the page.

Daily quota broadly serves two purposes. First, it lets you control and limit your daily usage of the service. That way you can plan and distribute your credit over a period of time. You can adjust your daily quota by dollar value or token usage or both, depending on your usage pattern and budget planning. The other reason to enforce a daily quota is to limit the usage of the service per user. Since we serve with limited infrastructure and limited bandwidth of downstream AI services, the quota helps us allocate a fair share of the service for your usage. As we grow as a community, we will broaden the quota limits for your convenience.
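
Conceptually, a request passes the daily quota check only while both limits hold (a sketch under the assumption that both a dollar and a token quota are configured):

```python
def within_daily_quota(spent_dollars, used_tokens, dollar_quota, token_quota):
    """True while today's usage is under both daily limits."""
    return spent_dollars < dollar_quota and used_tokens < token_quota
```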

We believe both token and dollar quotas are necessary evils for serving the above purposes. While the dollar quota helps you plan your credit utilization, the token quota is essential for system and bandwidth limitations. We would love to hear your feedback on the usefulness of these quotas and any suggestions around them.

Miscellaneous

We value every piece of feedback and try to respond as soon as possible. Any feedback that reaches us via the feedback form or email goes to the founding members. So don't hold back; reach out as and when you feel appropriate.

We are committed to providing you full clarity and transparency on how AI models are performing for you in terms of speed, cost and user satisfaction. While the token usage and associated price breakdown bring clarity on your cost of the service, the performance charts bring in other aspects like AI processing time, answer quality etc. We categorized the performance aspects into multiple tabs -

  • Your Performance: Shows the performance of AI models based on your usage and ratings. With this, you will be able to analyze and review your strategy on AI models. We started with six performance charts: (1) average time in seconds taken by AI models to answer, (2) average user satisfaction on a scale of 1 to 5, (3) average number of tokens (input and output combined) used per call to an AI model during the time period you selected, (4) average cost in dollar value per call to an AI model, (5) total number of calls made to the AI model during the selected time period, and (6) total number of AI model responses the user flagged during the time period.
  • World Performance: Depicts the performance of AI models based on all usage and ratings of LLMProxy AI community members, including your usage and ratings. The charts are the same as in Your Performance. This tab gives you the world view of AI model performance and where you stand today.

Lastly, all the performance metrics are calculated over a time window. You may want to change the time period, from as small as the last three hours to as big as the last one year. The default time period is set to one month.
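
Each chart is an aggregate over the selected window. For instance, the average answer time can be computed like this (a sketch over hypothetical (timestamp, latency) records, not our actual charting code):

```python
from datetime import datetime, timedelta

def average_answer_time(records, window, now):
    """Mean AI answer time in seconds over the selected time window.

    `records` is a list of (timestamp, latency_seconds) pairs.
    """
    recent = [latency for ts, latency in records if now - ts <= window]
    return sum(recent) / len(recent) if recent else 0.0

now = datetime(2025, 6, 1)
records = [
    (datetime(2025, 5, 20), 2.0),   # inside a one-month window
    (datetime(2025, 5, 25), 4.0),   # inside
    (datetime(2024, 12, 1), 99.0),  # outside, ignored
]
avg = average_answer_time(records, timedelta(days=30), now)
```

Widening the window simply lets more records into the aggregate; the default corresponds to roughly `timedelta(days=30)`.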